The Markov chain model, with extensions to cover the phenomena
of arrivals and departures, was applied to a population of savings accounts
in a savings institution to forecast the size distribution, total number
of accounts and total amount of savings of the population. The stochastic
processes governing the behavior of the population were first assumed to
be time stationary. This assumption was then relaxed and an econometric
model was used to predict future values of the parameters of the
non-stationary model. Both models were validated by comparing predicted
size distributions, total number of accounts and total amount of savings
against observed values. The chi-square goodness-of-fit test was used
in the comparison. The fundamental matrix of the stationary model was
also used to predict the equilibrium distribution and related measures of
the population.
I.  INTRODUCTION

A.  PURPOSE 

It is the purpose of this thesis to develop and evaluate two analytical
models which can be used to forecast the structure of a population of savers
and the level of savings of a savings institution. The population of savers
is divided into a finite number of classes, and the structure is the
distribution of savers among the classes.

B.  BACKGROUND 

While  it  is  difficult,  if  not  impossible,  to  predict  the  future  behavior 
of  an  individual  it  is  believed  that  the  aggregate  behavior  of  a  population 
is  less  erratic  and,  therefore,  more  amenable  to  analysis  and  prediction. 
Assuming  that  a  large  population  has  considerable  inertia,  current  trends 
can  be  used  to  project  into  the  future. 

The rate of change of the structure and characteristics of a
population can, at times, be considered to depend upon its size, the
external forces which affect the members of the population and the
response to these forces.

In the case of the population of savers in savings institutions, it
has been observed that members of this population are not very responsive
to changes in economic conditions. Thus, during periods of constant
rate of expansion or contraction in the business cycle, external forces
affecting this population may be considered to be constant and a time
stationary Markov chain model may be used to study the behavior of the
population.

However,  during  turning  points  in  the  business  cycle  or  periods 
of  rapid  economic  changes,  external  forces  may  be  sufficiently  large  to 
affect  the  savings  pattern  of  the  savers  so  that  the  stationarity  assumption 
may  no  longer  hold.     Under  these  circumstances  a  more  comprehensive 
model  which  takes  into  consideration  the  effects  of  external  conditions 
on the behavior of the savers would be required. The major problem in
constructing such a model would be discovering the factors which
affect the population and measuring both the effect of each factor and
the effects of interaction between factors.

The  effect  of  competition  between  various  savings  institutions  for 
a  larger  share  of  the  savers'  market  could  not  be  modeled  because  of  the 
lack  of  data.     However,  it  is  believed  that,  in  the  short  run,  the  savers' 
market  is  in  a  state  of  equilibrium  and  the  share  of  the  market  captured 
by a savings institution is relatively constant. Thus it can be assumed
that competition does not affect the savers' behavior to such an extent
that ignoring its effect would render a model inadequate.

C.  REVIEW OF MARKOV CHAIN MODELS

The basic model used in this study was introduced by A. A. Markov
(1856-1922) around 1907. This model was first applied in economics to
the analysis of income and wage distributions by Solow [21] in 1951.
The same model was used by Hart and Prais [12] in 1956 in a study of
business concentration.

The model assumes that a population of entities can be classified
into a finite number of classes. The population is observed at
equidistant time points. The number of entities observed to move from one
class to another is assumed to be generated by a stochastic process.
The probability of transition is assumed to depend only on the class the
entity is in at the current time interval, and not on where it had
previously been. This process of change can be completely described by
a transition matrix, P, as shown below. The $p_{ij}$ element is the probability
that an entity currently in the ith class will be found in the jth class
after one time period. If the stochastic process is time stationary then
the matrix does not change with time.


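P MATRIX

$$
\begin{array}{cc|cccc}
 & & \multicolumn{4}{c}{\text{Ending in Class}} \\
 & & \text{I} & \text{II} & \cdots & \text{M} \\ \hline
 & \text{I} & p_{11} & p_{12} & \cdots & p_{1m} \\
\text{Beginning} & \text{II} & p_{21} & p_{22} & \cdots & p_{2m} \\
\text{in Class} & \vdots & \vdots & \vdots & & \vdots \\
 & \text{M} & p_{m1} & p_{m2} & \cdots & p_{mm}
\end{array}
$$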
In most of the research studies using this model the general
procedure has been to observe some pattern of change over time and, assuming
that the stochastic process is time stationary, estimate the transition
probabilities and project the future change.

Projection of the expected number of entities in each class can be
computed as follows:

Let the number of entities in each class at time t be
$n_1^t, n_2^t, \ldots, n_m^t$. If the transition probabilities are known
then the expected numbers of entities moving from the ith class into
classes 1, 2, ..., m are $p_{i1} n_i^t, p_{i2} n_i^t, \ldots, p_{im} n_i^t$.

The expected number of entities in each class at time t+1 can be
found by adding up all the entities that have moved into the class and
those that did not move out. Thus

$$
\begin{aligned}
n_1^{t+1} &= n_1^t p_{11} + n_2^t p_{21} + \ldots + n_m^t p_{m1} \\
n_2^{t+1} &= n_1^t p_{12} + n_2^t p_{22} + \ldots + n_m^t p_{m2} \\
&\;\;\vdots \\
n_m^{t+1} &= n_1^t p_{1m} + n_2^t p_{2m} + \ldots + n_m^t p_{mm}
\end{aligned}
$$

In matrix notation the above expressions can be compactly written
as:

$$N^{t+1} = N^t \times P$$

where $N^t = (n_1^t \;\; n_2^t \;\; \ldots \;\; n_m^t)$, a 1 x m vector

$N^{t+1} = (n_1^{t+1} \;\; n_2^{t+1} \;\; \ldots \;\; n_m^{t+1})$, a 1 x m vector

P = matrix as defined earlier.
$N^{t+2}$ can be computed by replacing $N^t$ by $N^{t+1}$ in the above
expression. This is equivalent to multiplying $N^t$ by P x P. The distribution
after k periods can thus be obtained by multiplying $N^t$ by P raised to the
kth power.
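
The k-period projection described above can be illustrated with a short
numerical sketch. It is only an illustration under stated assumptions: the
function name `project`, the three-class matrix and the starting counts are
hypothetical and are not taken from the study.

```python
import numpy as np

def project(n_t, P, k):
    """Expected class counts k periods ahead: N^(t+k) = N^t P^k."""
    n_t = np.asarray(n_t, dtype=float)
    P = np.asarray(P, dtype=float)
    # Each row of a transition matrix must sum to one.
    if not np.allclose(P.sum(axis=1), 1.0):
        raise ValueError("each row of P must sum to one")
    return n_t @ np.linalg.matrix_power(P, k)

# Hypothetical three-class example (illustrative numbers only).
P = np.array([[0.8, 0.2, 0.0],
              [0.1, 0.7, 0.2],
              [0.0, 0.3, 0.7]])
n0 = np.array([100.0, 50.0, 25.0])

one_step = project(n0, P, 1)   # N^1 = N^0 P
two_step = project(n0, P, 2)   # N^2 = N^0 P P
```

Because every row of P sums to one, the projected counts always add up to the
starting total, which is the fixed-population limitation of the basic model.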

This  basic  model  has  two  major  limitations.     First,  it  assumes  that 
the total number of entities in the system is fixed. This assumption has
been violated frequently in practical applications of this model, as changes
due to entities entering the system, leaving it or losing identity by merging
are the rule rather than the exception. Second, the assumption that the
stochastic process is time stationary is untenable over long periods.
Changes  in  numerous  exogenous  variables  such  as  wage  rates,  technology 
and  legal  requirements  are  likely  to  result  in  changes  in  the  transition 
probabilities. 

Adelman [1] in 1958 overcame the first limitation by introducing
the concept of a reservoir of potential entrants, from which entrants may
come and to which exits may go. There was an operational difficulty in
estimating the size of the population of potential entrants. However,
Adelman pointed out that the exact size of this population need not be
known if one was dealing with the proportion of entities in each class
rather than with the exact number of entities. She therefore assumed
the size of the reservoir to be 100,000. The reason given for this
choice was that it must be large relative to the number of entities in the
system.
Stanton and Kettunen [22] in 1967 confirmed Adelman's observation
but went on to demonstrate that: "The number of potential entrants to an
industry or to a population has a definite and measurable effect on
subsequent projections made for that distribution when Markov processes are
used." Thus, if the number of entities in each class is required, an
arbitrary choice of the size of the population of entrants will not work.

Duncan and Lin [9] in 1972 proposed that arrivals could be treated
as a separate stochastic process. The entry of an entity into the system
is viewed as a two-stage process: first, arriving into the system, then
entering into a particular class. One could then estimate the parameters
of the entire process by observing the arrivals, the distribution of arrivals
among the classes and the transitions between classes separately. They
denoted the data by Z, which was composed of the number of arrivals into
each class (A) and the number of transitions between each class (X). The
set of parameters of the process was denoted by $\theta = (P, p, \pi)$, where P
was the transition matrix, p was the multinomial vector of probability of
an arrival entering a particular class and $\pi$ was the vector of parameters
of the arrival distribution. The sampling distribution was then written as
follows:

$$
f_\theta(z) = f_\theta(x, a) = f_\theta(x \mid A = a)\, f_\theta(a)
            = f_P(x \mid A = a)\, f_{(p,\pi)}(a)
$$

The likelihood function could then be written as

$$L_z(\theta) = L_{x \mid A=a}(P)\; L_a(p, \pi)$$
Three reasons were given for the importance of the factorization
shown above:

"a.  The first factor $L_{x \mid A=a}(P)$ depends on Z only through
the transition counts;

b.  The second factor $L_a(p, \pi)$ depends on Z only on the
observed entries; and

c.  Likelihood inference is reduced to two distinct and
simpler problems."

Anderson and Goodman [2] in 1957 proposed a number of statistical
tests for the following hypotheses:

a.  that the transition probabilities of a first order chain
are constant;

b.  that in case the transition probabilities are constant,
they are specified numbers;

c.  that the process is a uth order Markov chain against the
alternative that it is rth but not uth order.

Because of the factorization of the likelihood function, Duncan and
Lin concluded that the methods of Anderson and Goodman are applicable
to a system with a changing number of entities.

Hallberg [11] in 1969 challenged one of the most demanding
assumptions of the Markov chain model, that the transition probabilities are
constant regardless of time. He proposed to overcome this problem by
relating transition probabilities to economic variables and using these
relations to predict future values of transition probabilities. For
unknown reasons he regressed transition probabilities against the logarithms
of exogenous variables. Some predicted transition probabilities did not
fall within the range of zero to one. He then suggested setting
negative predictions to zero and normalizing each row of the transition
matrix by dividing each element by the row sum.
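
The adjustment attributed to Hallberg (set negative predictions to zero, then
divide each row by its sum) can be sketched as follows. This is a minimal
illustration, not Hallberg's code: the function name `clip_and_normalize` and
the matrix of raw predictions are hypothetical.

```python
import numpy as np

def clip_and_normalize(P_pred):
    """Set negative predictions to zero, then divide each row by its sum."""
    P = np.clip(np.asarray(P_pred, dtype=float), 0.0, None)
    row_sums = P.sum(axis=1, keepdims=True)
    if np.any(row_sums == 0):
        raise ValueError("a row has no positive prediction to normalize")
    return P / row_sums

# Hypothetical raw regression predictions; the first row contains a negative value.
raw = np.array([[0.9, 0.2, -0.1],
                [0.3, 0.6, 0.3]])
adjusted = clip_and_normalize(raw)
```

After the adjustment every row sums to one and no element is negative, so the
result is a valid transition matrix, although the rescaling changes the values
that the regressions actually predicted.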

D.  REMARKS

Despite the limitations of the basic Markov chain model it has been
successfully used in a variety of situations. The Duncan and Lin approach
extends the basic model to include arrivals and departures. This can be
done with little additional effort. To extend the model to cover the
possibility of non-stationary transition probabilities is a considerably more
difficult task. The first problem is acquiring a data base which is large
enough to yield precise estimates of transition probabilities. The data
must also span a long period so that the factors which affect the transition
probabilities have an opportunity to vary. The second problem is to
identify these factors and to obtain a functional relationship between
them and the transition probabilities. The third problem is to predict the
future values of these factors. The prediction of the non-stationary Markov
chain model is only as good as the prediction of these factors. The approach
suggested by Hallberg can be improved by transforming the estimates of
transition probabilities into logits (the logarithm of the estimates of odds
of transition). This will ensure that the predicted transition probabilities
are between zero and one.
The basic Markov chain model is used in this paper to model the
behavior of a population of savers at a savings institution. The Duncan
and Lin approach is used to treat the phenomena of entries and exits. A
non-stationary Markov chain model (Model II of this paper) has also been
developed. The parameters of the models were estimated with data from
five quarters. The models were then validated with data from the following
five quarters. The chi-square goodness-of-fit test was used to compare
the predictive power of the two models.
II.  MODEL OF A POPULATION OF SAVERS

A.  GENERAL 

The population of savers is first divided into m classes by the
amount of savings each saver has in his savings account. Each saver is
free to increase or decrease his savings and to leave the savings
institution by closing his account. The population is observed periodically.
A projection of the structure of the population and the amount of savings
in each class, based on these observations, is desired. A Markov chain
model can be used for this purpose provided the basic assumptions of
the model are not violated.

B.  ASSUMPTIONS 

1.  The  probability  that  an  account  moves  from  class  i  to  class  j 
depends  only  on  class  i  and  does  not  depend  on  the  past  history  of  the 
account.    This  is  obviously  not  true  for  an  individual  account  but  possibly 
holds  for  the  population  of  a  given  class. 

2.  Each saver acts independently of other savers. If savers
act in unison then a Markov model will fail, as the assumption of
independence is no longer valid. However, the assumption generally holds
even if savers are affected by the same factors. The transition
probabilities may shift because of these factors but the randomness in action
of individual savers is still there.
3.  The distribution of the size of accounts within a class is
independent of the number of accounts that move in or out of that class.
This assumption is not required for the Markov model but is necessary if one
has to determine the amount of savings from a knowledge of the number
of accounts in each class. This assumption is generally true if the
number of accounts in each class is large relative to the net change in
each period. This assumption can be violated if the number of accounts
in each class is small and if the class boundaries are wide.

4.  The transition probabilities, arrival rate and the distribution
of entrants among states are time stationary. This assumption may hold
during periods of constant expansion or contraction of the business cycle.
However, it is not expected to hold over long periods and during times
when external forces change the saving pattern of savers. This
assumption is relaxed in Model II, where an attempt was made to discover the
functional relationship of these parameters with economic factors and other
exogenous variables.

C.  DESCRIPTION OF MODEL I

1.  The Transition Matrix

Model I has only one stochastic process, the basic Markov
chain model. The number of arrivals is considered to be constant and the
proportion of arrivals entering each class is also constant.

Let  m = total number of classes including one class of closed
accounts

t = time, measured in periods, 0, 1, 2, ...

$e^t$ = the accumulated number of accounts that have closed
at time t

$f^{t\prime} = (f_2^t \;\; f_3^t \;\; \ldots \;\; f_m^t)$
= number of accounts in each active class at t

$c^{t\prime} = (c_2^t \;\; c_3^t \;\; \ldots \;\; c_m^t)$
= number of new accounts entering each active class
at time t

$$
P = \begin{pmatrix}
p_{11} & p_{12} & \cdots & p_{1m} \\
p_{21} & p_{22} & \cdots & p_{2m} \\
\vdots & \vdots &        & \vdots \\
p_{m1} & p_{m2} & \cdots & p_{mm}
\end{pmatrix}
$$


Let class 1 be the class which contains all the closed accounts.
It is assumed that an account in the inactive state will not re-enter the
active states. Thus $p_{11} = 1.0$ and $p_{1j} = 0.0$, $j = 2, \ldots, m$.

The expected number of accounts at time t can be computed from
the following relationship [9]:

$$
E(e^t \;\; f^{t\prime}) = (0 \;\; f^{0\prime})\, P^t + (0 \;\; c') \sum_{k=0}^{t-1} P^k
$$

where $t = 0, 1, \ldots, T$, $P^0 = I$ and $c'$ is the constant vector of
entrants into the active classes.
The first term on the right of the equality sign is the expected
number of accounts in each class at time t from the original population
$f^{0\prime}$. The second term is the expected number of accounts in each class
derived from those accounts which join the system at each period. Thus,
the accounts that arrive by period 1 would have undergone t-1 periods of
transition. Those that arrive by period 2 would have undergone t-2 periods
of transition. Those arriving at time t would undergo no transition, as
$P^0 = I$.

As the stochastic process has been assumed to be time
stationary, the elements of the P matrix are constant and $P^t$ is just the
single period P matrix raised to the t-th power.

The expected total number of accounts in the system at
time t is just the sum of the elements of $f^{t\prime}$.

If the size distribution of accounts within each class is
constant over the period of prediction, then the amount of savings in each
class can be estimated by multiplying the expected number of accounts
by the average amount of savings in that class.
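
The Model I projection can be sketched numerically as below. This is an
illustrative sketch only, assuming constant entrants per period and class 1
(index 0) as the absorbing class of closed accounts. The function name
`model_one_projection`, the three-class matrix, the starting counts, the
entrant vector and the average balances are all hypothetical.

```python
import numpy as np

def model_one_projection(f0, c, P, t):
    """Expected closed accounts and active-class counts at time t.

    f0 : counts in the m-1 active classes at time 0
    c  : constant entrants per period into the m-1 active classes
    P  : m x m transition matrix; index 0 is the closed (absorbing) class
    """
    P = np.asarray(P, dtype=float)
    m = P.shape[0]
    start = np.concatenate(([0.0], np.asarray(f0, dtype=float)))
    entrants = np.concatenate(([0.0], np.asarray(c, dtype=float)))
    power_sum = np.zeros((m, m))   # accumulates P^0 + P^1 + ... + P^(t-1)
    Pk = np.eye(m)                 # P^0 = I
    for _ in range(t):
        power_sum += Pk
        Pk = Pk @ P
    # Pk now equals P^t.
    expected = start @ Pk + entrants @ power_sum
    return expected[0], expected[1:]

# Hypothetical example: one closed class and two active classes.
P = np.array([[1.00, 0.00, 0.00],
              [0.10, 0.70, 0.20],
              [0.05, 0.15, 0.80]])
f0 = [100.0, 50.0]
c = [10.0, 5.0]
mean_savings = np.array([500.0, 5000.0])   # assumed average balance per class

closed_1, active_1 = model_one_projection(f0, c, P, 1)
closed_2, active_2 = model_one_projection(f0, c, P, 2)
total_accounts_2 = active_2.sum()
savings_by_class_2 = active_2 * mean_savings
```

The savings estimate in the last line relies on the assumption that the size
distribution within each class stays constant over the prediction period.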
2.  The Equilibrium Distribution

If prevailing conditions were to persist, the structure of the
population would reach an equilibrium in which the number of accounts
leaving each class is balanced by an equal influx of accounts from the
other classes. The limiting distribution is given by [9]:

$$
\lim_{t \to \infty} E(e^t \;\; f^{t\prime}) = (\infty \;\; c'(I - Q)^{-1})
$$
where Q is the sub-matrix of P obtained by removing the column of
transition probabilities from the classes of active accounts (Class II to
Class XI) into the class of closed accounts (Class I), and the row of
transition probabilities of the class of closed accounts.

The matrix $(I - Q)^{-1}$ is often called the fundamental matrix,
denoted by M. The ij-th element of this matrix is the expected number of
periods that a new account entering class i when it joins the system
will spend in class j before closing.

The expected number of periods that a new account entering
class i when it joins the system will remain in the system can be found
by summing the i-th row of the fundamental matrix.

The above results and further treatment can be found in
Chapter 3 of Ref. [13].
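
A small numerical sketch of the fundamental matrix and the equilibrium
distribution follows. It assumes the closed class is stored at index 0 of P;
the function name `fundamental_matrix`, the matrix and the entrant vector are
hypothetical illustrations, not values from the study.

```python
import numpy as np

def fundamental_matrix(P):
    """M = (I - Q)^(-1), where Q drops the closed class (index 0) from P."""
    P = np.asarray(P, dtype=float)
    Q = P[1:, 1:]
    return np.linalg.inv(np.eye(Q.shape[0]) - Q)

# Hypothetical example: one closed class and two active classes.
P = np.array([[1.00, 0.00, 0.00],
              [0.10, 0.70, 0.20],
              [0.05, 0.15, 0.80]])
c = np.array([10.0, 5.0])          # assumed constant entrants per period

M = fundamental_matrix(P)
equilibrium = c @ M                # limiting expected count in each active class
expected_life = M.sum(axis=1)      # expected periods in the system, by class of entry
```

At equilibrium the active-class counts reproduce themselves after one more
period of transitions and arrivals, which gives a quick consistency check on
the computed vector.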

3.  Prediction Interval for Single Step Transition

The  predictions  made  with  Model  I  are  point  estimates.    They 
do  not  provide  any  information  as  to  how  close  they  could  be  to  a  future 
observation. A prediction interval which gives the range of values that
a future observation would take, say, ninety percent of the time would be
of greater value to a decision maker.

The  number  of  accounts  in  each  class  is  the  sum  of  m  binomial 
random  variables.     If  the  number  of  accounts  in  each  class  is  large  then 
the  binomial  random  variables  can  be  approximated  by  normal  random 
variables.     The  sum  of  normal  random  variables  is  another  normal  random 
variable.     Thus  a  prediction  interval  can  be  constructed  using  this 
approximation. For one step transition the prediction interval can be
easily  constructed.     However,  for  more  than  one  step  transitions  the 
task  of  constructing  a  prediction  interval  becomes  rather  difficult.     The 
problem  is  that  after  the  first  transition  the  number  of  accounts  in  each 
class  becomes  random  and  the  expression  for  the  unconditional  variance 
of  the  number  of  accounts  becomes  quite  unmanageable.    The  expressions 
for  the  variance  of  the  number  of  accounts  in  each  class,  the  total  number 
of  accounts,  the  amount  of  savings  in  each  class  and  the  total  amount 
of  savings  for  single  step  transition  are  listed  below.     The  derivation  can 
be  found  in  Appendix  A. 

Let  $n_j^a$ be the number of accounts in class j at beginning of
time period a

$p_{ij}$ be the transition probability of an account from
class i to j

$N^a$ be the number of accounts in the system at beginning
of time period a

$Z_j^a$ be the amount of savings in class j at beginning of
time period a

$Z^a$ be the total amount of savings in the system at
beginning of time period a

$$
\mathrm{Var}(n_j^{a+1}) = \sum_{i=2}^{m} n_i^a\, p_{ij}(1 - p_{ij})
$$

$$
\mathrm{Var}(N^{a+1}) = \sum_{j=2}^{m} \mathrm{Var}(n_j^{a+1})
  + 2 \sum_{j=2}^{m-1} \sum_{k=j+1}^{m} \mathrm{Cov}(n_j^{a+1}, n_k^{a+1})
$$

$$
\mathrm{Cov}(n_j^{a+1}, n_k^{a+1}) = \sum_{i=2}^{m} (-n_i^a\, p_{ij}\, p_{ik})
$$

Let $z_{kj}$ be the amount of savings in an account which has moved
into class j

$$
\mathrm{Var}(Z_j^{a+1}) = \sum_{i=2}^{m} \left[ n_i^a\, p_{ij}\, \mathrm{Var}(z_{kj})
  + E^2(z_{kj})\, n_i^a\, p_{ij}(1 - p_{ij}) \right]
$$

$$
\mathrm{Cov}(Z_j^{a+1}, Z_l^{a+1}) = \sum_{i=2}^{m} (-n_i^a\, p_{ij}\, p_{il})\, E(z_{kj})\, E(z_{kl})
$$

$$
\mathrm{Var}(Z^{a+1}) = \sum_{j=2}^{m} \mathrm{Var}(Z_j^{a+1})
  + 2 \sum_{j=2}^{m-1} \sum_{l=j+1}^{m} \mathrm{Cov}(Z_j^{a+1}, Z_l^{a+1})
$$

Using these expressions the prediction intervals for a single
step transition are as follows:

90% Prediction Interval

of number of accounts in class i = $f_i \pm 1.645 \times (\mathrm{Var}(n_i))^{1/2}$

of total number of accounts = $\sum_{j=2}^{m} f_j \pm 1.645 \times (\mathrm{Var}(N))^{1/2}$

of amount of savings in class i = $s_i \pm 1.645 \times (\mathrm{Var}(Z_i))^{1/2}$

of total amount of savings = $\sum_{j=2}^{m} s_j \pm 1.645 \times (\mathrm{Var}(Z))^{1/2}$

where

$f_i$ = expected number of accounts in class i = $E(n_i)$ after
one period

$s_i$ = expected amount of savings in class i = $E(Z_i)$ after
one period

N = total number of accounts in the system after one
period (random variable)

$n_i$ = number of accounts in class i after one period (random
variable)

$Z_i$ = amount of savings in class i after one period (random
variable)

Z = total amount of savings in the system after one period
(random variable)
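
The single-step interval for the number of accounts can be sketched as below
for the case of non-stochastic arrivals. It is an illustration under stated
assumptions: Q holds only the transition probabilities among active classes,
each account moves independently, and the normal approximation applies. The
function name `one_step_intervals` and all numbers are hypothetical.

```python
import numpy as np

def one_step_intervals(n, Q, c=None, z=1.645):
    """90% one-step prediction intervals for active-class counts and their total.

    n : current counts in the active classes
    Q : transition probabilities among active classes (rows may sum to < 1)
    c : constant entrants per period (shifts the mean, adds no variance)
    """
    n = np.asarray(n, dtype=float)
    Q = np.asarray(Q, dtype=float)
    c = np.zeros_like(n) if c is None else np.asarray(c, dtype=float)
    mean = n @ Q + c
    var = n @ (Q * (1.0 - Q))        # Var(n_j) = sum_i n_i p_ij (1 - p_ij)
    cov = -(Q.T * n) @ Q             # off-diagonal: -sum_i n_i p_ij p_ik
    np.fill_diagonal(cov, var)
    var_total = cov.sum()            # sum of variances plus twice the covariances
    half = z * np.sqrt(var)
    class_intervals = np.column_stack((mean - half, mean + half))
    half_total = z * np.sqrt(var_total)
    total_interval = (mean.sum() - half_total, mean.sum() + half_total)
    return mean, var, var_total, class_intervals, total_interval

# Hypothetical two active classes.
n = [100.0, 50.0]
Q = np.array([[0.70, 0.20],
              [0.15, 0.80]])
mean, var, var_total, class_intervals, total_interval = one_step_intervals(
    n, Q, c=[10.0, 5.0])
```

The negative covariances make the variance of the total smaller than the sum
of the class variances, because an account that moves into one class cannot
also be in another.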
The model can be extended to cover the case of stochastic
arrivals. Assuming the arrival process to be independent of the Markov
chain process, the expression for the number of accounts is the same as
for the case of non-stochastic arrivals. The only difference is in
replacing the vector of entrants ($c'$) by the product of the expected number
of arrivals and the multinomial vector of probability of entering each
active class. Thus,

$$c' = E(R)\,(p_2 \;\; p_3 \;\; \ldots \;\; p_m)$$

where R = random number of arrivals

$p_i$, i = 2, 3, ..., m = probability of an arrival entering class i

$c'$ = vector of entrants into the active classes

The expressions for variance are changed to take into account
the variability introduced by the additional stochastic processes.

Let  $e_j^{a+1}$ be the random number of entrants into class j at time
period a+1

$r^{a+1}$ be the number of arrivals at time period a+1

R be the random variable of arrivals

$$
\mathrm{Var}(e_j^{a+1} \mid R = r^{a+1}) = r^{a+1}\, p_j (1 - p_j)
$$

$$
\mathrm{Var}(e_j^{a+1}) = p_j (1 - p_j)\, E(R) + p_j^2\, \mathrm{Var}(R)
$$

Since arrivals have been assumed to be independent of the
accounts in the system

$$
\mathrm{Var}(n_j^{a+1}) = \mathrm{Var}(e_j^{a+1}) + \sum_{i=2}^{m} n_i^a\, p_{ij}(1 - p_{ij})
$$

$$
\mathrm{Var}(N^{a+1}) = \sum_{j=2}^{m} \mathrm{Var}(n_j^{a+1})
  + 2 \sum_{j=2}^{m-1} \sum_{k=j+1}^{m} \mathrm{Cov}(n_j^{a+1}, n_k^{a+1})
$$

$$
\mathrm{Cov}(n_j^{a+1}, n_k^{a+1}) = \sum \ldots \quad \text{[remainder illegible in the source]}
$$

Let $E(Z_j)$ be the expected amount of savings in an account in class j

$z_{kj}$ be the amount of savings in an account which has just entered
class j

$$
\mathrm{Var}(Z_j^{a+1}) = E(Z_j)^2\, p_j (1 - p_j)
  + \sum_{i=2}^{m} \left[ n_i^a\, p_{ij}\, \mathrm{Var}(z_{kj})
  + E(z_{kj})^2\, n_i^a\, p_{ij}(1 - p_{ij}) \right]
$$

$$
\mathrm{Cov}(Z_j^{a+1}, Z_l^{a+1}) = \sum_{i=2}^{m} (-n_i^a\, p_{ij}\, p_{il})\, E(z_{kj})\, E(z_{kl})
$$

$$
\mathrm{Var}(Z^{a+1}) = \sum_{j=2}^{m} \mathrm{Var}(Z_j^{a+1})
  + 2 \sum_{j=2}^{m-1} \sum_{l=j+1}^{m} \mathrm{Cov}(Z_j^{a+1}, Z_l^{a+1})
$$
D.  DESCRIPTION OF MODEL II

1.  The Arrival Process

It was observed that the number of new accounts opened in
each quarter was between seven hundred and one thousand. For such
large arrival rates, an assumption that the arrival rate is normally
distributed would be reasonable. However, it was felt that the arrival
distribution could be affected by external factors like the state of the
national economy, seasonal effects and the level of promotional or
advertising activity of the savings institution. Thus the following linear
econometric model was considered:

$$Y = a_1 X_1 + a_2 X_2 + \ldots + a_{10} X_{10} + e$$

where

$Y$ = Number of new accounts opened in each quarter

$X_1$ = Dummy variable for quarters of the year

$X_2$ = California non-agricultural employment

$X_3$ = Advertising and promotional expense of the savings
institution

$X_4$ = Prime commercial paper rate, 4-6 months

$X_5$ = U. S. Government securities rate, 6 months

$X_6$ = Corporation bonds rate

$X_7$ = Wholesale price index

$X_8$ = U. S. Government securities rate, 3 months

$X_9$ = California personal income

$X_{10}$ = U. S. total credit

$e$ = Normally distributed random variable with zero
mean and constant variance
The linear model was selected because of its simplicity
and because of the lack of data required by more complex models.
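
Fitting a linear model of this form by ordinary least squares can be sketched
as follows. The sketch is purely illustrative: the function name
`fit_linear_model` is hypothetical, the regressors are synthetic random
numbers rather than the economic series listed above, and only three
regressors are used to keep the example short.

```python
import numpy as np

def fit_linear_model(X, y):
    """Ordinary least squares coefficients for y = X a + e."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    coef, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Synthetic data: 12 made-up quarters and 3 made-up regressors.
rng = np.random.default_rng(0)
X = rng.normal(size=(12, 3))
true_coef = np.array([50.0, -20.0, 5.0])
y = X @ true_coef                 # noise-free, so the fit recovers true_coef

coef = fit_linear_model(X, y)
# Forecast arrivals for one hypothetical set of future regressor values.
forecast = np.array([1.0, 0.5, -0.2]) @ coef
```

With real data the number of observations limits how many coefficients can be
estimated reliably, which is consistent with the preference for a simple model
stated above.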

2.  The Size Distribution of New Accounts

The size distribution of new accounts may also change with
time and external conditions. To model this change, the probabilities
of new accounts entering each class were related to the same set of
exogenous variables listed in sub-section 1. Direct application of least
squares to the probabilities may yield predictions of future values that
are outside the zero to one range. To overcome this potential area of
difficulty the estimates of the probabilities were first transformed into
logits.

3  .         Logit  Analysis 

Logit  analysis  is  a  special  application  of  Econometrics  to 
situations  in  which  the  dependent  variable  has  a  dichotomous  character. 
The  object  is  to  estimate  the  probability  of  occurrence  of  a  specified 
event  given  a  set  of  prevailing  conditions.     For  application  in  this  study 
one  looks  for  the  probability  that  a  new  account  enters  a  particular  class 
and  the  probability  that  an  account  will  move  from  one  class  to  another, 
given  a  set  of  external  conditions.    Direct  application  of  least  squares 
may  result  in  the  prediction  of  probabilities  outside  the  zero  and  one 
range.    A  monotonic  transformation  can  overcome  this  difficulty.     One 
simple  transformation  is  to  divide  the  relative  frequency  by  one  minus 
the  relative  frequency.     This  quantity  is  an  estimate  of  the  odds  of  the 
occurrence  of  the  event.    This  transformation  is  still  restrictive  as  the 
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new  variable  can  take  on  only  positive  values.     This  problem  can  be 
overcome  by  taking  the  logarithm  of  this  quantity.     The  logarithm  of  the 
estimated  odds  is  termed  the  logit  of  an  event.     The  model  used  to  predict 
future  values  of  the  parameters  of  the  entrants  distribution  and  the  tran- 
sition probabilities  of  the  transition  matrix  was  as  follows: 

Log(p./(l  -  p.))  =  a0+aiX1+a2X2  +  .   .   .  a^X^  +  e 

Log(p../(l  -  p..))  =b0+blXlb2X2  t  .   .   .  b1QX10  +  e 

There is a further restriction: the probabilities of the entrants
distribution must sum to one, and each row of the transition matrix
must also sum to one. The approach taken in this paper
was to sum the predicted probabilities and then divide each by the
sum.
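
The transformation and the renormalization described above can be sketched in Python as follows. This is only an illustration: natural logarithms are assumed (the text does not state the base), the function names are hypothetical, and the predicted logits are made-up numbers rather than results from this study.

```python
import math

def logit(p):
    """Logarithm of the odds p / (1 - p); defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))

def inverse_logit(y):
    """Map any real value back into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-y))

def normalize(probs):
    """Divide each predicted probability by their sum so they add to one."""
    total = sum(probs)
    return [p / total for p in probs]

# Hypothetical predicted logits for one row of a three-class matrix.
predicted_logits = [-2.0, 1.5, -1.0]
raw = [inverse_logit(y) for y in predicted_logits]  # each lies in (0, 1)
row = normalize(raw)                                # now sums to one
```

Because the regression is carried out on the logit scale, any predicted value maps back to a probability strictly between zero and one; the final division only enforces the sum-to-one restriction.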

4.  Transition  Matrix  of  Model  II 

The transition matrix of Model II is allowed to change with
external factors; thus the t-step transition matrix is no longer the
single-step matrix raised to the t-th power but is the product of t
matrices, one for each period.
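
A minimal Python sketch of this product is given below. The two 2 x 2 matrices are hypothetical and serve only to show that the t-step matrix is built by multiplying the one-step matrices in time order; the helper names are illustrative.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def t_step_matrix(matrices):
    """Product P(1) P(2) ... P(t) of a sequence of one-step matrices."""
    n = len(matrices[0])
    # Start from the identity matrix and multiply in time order.
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for p in matrices:
        result = mat_mul(result, p)
    return result

# Hypothetical one-step matrices for two successive periods.
p1 = [[0.9, 0.1], [0.2, 0.8]]
p2 = [[0.7, 0.3], [0.4, 0.6]]
two_step = t_step_matrix([p1, p2])
```

When every period uses the same matrix, the product reduces to the matrix power of the stationary model.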

5.  Predictions with Model II

To use Model II the first step is to obtain predictions
of future values of the factors that appear in the regression equations.
The parameters of the arrival process, the entrance process and the
transition probabilities are then predicted. The expected number of
accounts in each class can then be computed from the following
quantities:

Vector of expected number of accounts in each active class at time t,
t = 0, 1, ..., T

Transition matrix at time j, j = 0, 1, ..., T

Transition matrix at time k, k = 0, 1, ...

Vector of expected number of entrants in each active class at time t

Expected total number of active accounts in the system

Expected total amount of savings in class j, \( f_j \times E(z_j) \)

\( E(z_j) \) = Average amount of savings in each account in class j

Expected total amount of savings in the system at time t
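
As a rough illustration of how such a prediction might be organized, the Python sketch below advances the vector of expected class counts one period at a time and then converts counts to savings. The recursion used here (counts carried through the period's transition matrix, plus that period's expected entrants) is an assumption made for illustration and is not taken from the expression in the text; all numbers and names are hypothetical.

```python
def step(counts, matrix, entrants):
    """One period: move existing accounts, then add expected new entrants.

    Rows of `matrix` cover active classes only, so they may sum to less
    than one (the remainder being accounts that close).
    """
    m = len(counts)
    moved = [sum(counts[i] * matrix[i][j] for i in range(m)) for j in range(m)]
    return [moved[j] + entrants[j] for j in range(m)]

def forecast(initial, matrices, entrants_by_period):
    """Apply `step` once per period with that period's matrix and entrants."""
    counts = list(initial)
    for matrix, entrants in zip(matrices, entrants_by_period):
        counts = step(counts, matrix, entrants)
    return counts

# Hypothetical two-class example for a single period.
initial = [100.0, 50.0]
matrices = [[[0.80, 0.10], [0.10, 0.85]]]
entrants_by_period = [[10.0, 2.0]]
counts = forecast(initial, matrices, entrants_by_period)

# Expected savings: expected count in each class times its mean balance.
mean_savings = [500.0, 3000.0]
total_savings = sum(f * z for f, z in zip(counts, mean_savings))
```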
III.  THE DATA AND ESTIMATION OF PARAMETERS

A.  DESCRIPTION OF DATA BASE

1.  General

The data used in this study were obtained from the local branch
of a savings institution. The population of passbook accounts was selected
for study as it has greater mobility than other types of savings accounts.

The quarterly earnings ledgers for 1971, 1972 and the first
two quarters of 1973 were made available for this study. The quarterly
earnings ledgers contain the following information that has a bearing
on this study:

1.  Identification  number  of  each  active  account. 

2.  Amount  of  savings  as  of  the  last  day  of  each  quarter. 

3.  Amount  of  earnings  for  the  quarter. 

4.  Summary  statistic  of  total  number  of  active  accounts. 

5.  Summary  statistic  of  total  amount  of  savings. 

6.  Summary  statistic  of  total  earnings  withdrawn. 

7.  Summary  statistic  of  total  earnings  accrued. 

The basic Markov chain model requires the initial distribution
of the subject population and the transition probability matrix for complete
specification. A preliminary sample of two hundred accounts showed that
seventy-two percent of the population would have balances below two
thousand dollars. A very large random sample would, therefore, be
required to pick out the behavior of large accounts. It was decided to pick
a stratified sample instead. Thus, the sample of accounts examined
consisted of three blocks of about two hundred each. The first consisted
of accounts with balances exceeding ten thousand dollars on 31 March
1971. The second block consisted of accounts between two and ten
thousand dollars and the third block consisted of accounts below two
thousand dollars. The quarterly balance of each account was recorded.
To determine the initial distribution of the population, the amounts of
savings of all the accounts with balances exceeding one thousand dollars
on 31 March 1972 were recorded. The accounts were sorted by magnitude and
then divided into ten classes. The class intervals were selected to
ensure that the amount of savings in each class was of the same order of
magnitude. The first eight classes uniformly spanned the interval $1 -
$15,999. The ninth class contained all accounts between $16,000 and
$19,999 and the tenth class covered the range from $20,000 to $100,000.
Accounts exceeding $100,000 were rare; there were six of them in the
31 March 1972 population. Including them in the largest class could
result in an unstable mean of the amount of savings in that class; they
were thus eliminated from the population. It is believed that these large
accounts are important in the prediction of total amount of savings and
should, therefore, be treated separately. For the purpose of this study
the amount of savings for accounts exceeding $100,000 was considered
to be unchanged over the period of observation.
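
The class intervals just described can be expressed as a small lookup. The Python sketch below is illustrative only: the numbering 1 to 10 for the active classes, the value 0 for a closed account and `None` for accounts above $100,000 are conventions assumed here, and the function name is hypothetical.

```python
from bisect import bisect_right

# Lower bounds of the ten active classes: eight classes of width $2,000
# starting at $1, then $16,000 - $19,999 and $20,000 - $100,000.
LOWER_BOUNDS = [1, 2000, 4000, 6000, 8000, 10000, 12000, 14000, 16000, 20000]
UPPER_LIMIT = 100000  # larger accounts are treated separately

def classify(balance):
    """Return the active class (1..10), 0 if closed, None if above the limit."""
    if balance < 1:
        return 0
    if balance > UPPER_LIMIT:
        return None
    # Number of lower bounds not exceeding the balance gives the class.
    return bisect_right(LOWER_BOUNDS, balance)
```

For example, `classify(1500)` falls in the first active class and `classify(16000)` in the ninth.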
2.  Arrival Rate

The arrival rate was determined by taking the difference
between the last identification numbers of consecutive quarters. This
method failed to provide an accurate estimate of the arrival rate for
Quarter IV-72. It was subsequently learned from the management that
a block of about two hundred accounts had been used to facilitate some
financial transactions of servicemen newly arrived in Monterey. These
accounts were subsequently closed. With this information the arrival
rate for Quarter IV-72 was reduced accordingly.

3.  The Size Distribution of New Accounts

The  distribution  of  new  accounts  was  estimated  by  taking  a 
random  sample  of  two  hundred  and  fifty  from  the  population  of  new  accounts 
for  each  quarter. 

4.  The  Validation  Sample 

To test whether the models, with parameters estimated from six
hundred and twenty-two accounts, could predict the behavior of the
population, a sample comprising one-fourth of the accounts of Quarter I-73
was taken as a base for comparison. A chi square test was
performed to check whether the predicted distribution fits the observations.

5.  Summary Statistics

A second check on the predictive power of the model was
made by comparing the total number of accounts and total amount of
savings predicted for Quarters II-72 to II-73 against the summary statistics
for these quantities.

6.  Total Number of Accounts

It was found that the statistics for total number of active
accounts included those that had been closed. It appeared that these
accounts were purged from the records about once a year. As this
information would be used as a check on the accuracy of prediction it had to
be precise; thus, a page count of each quarter's ledger was conducted.
The information on the total number of accounts and the arrival rate is
shown in Table I.

TABLE I

TIME SERIES OF TOTAL NUMBER OF
ACCOUNTS AND ARRIVAL RATE

             # OF NEW     TOTAL # OF    MARGINAL
QUARTER      ACCOUNTS     ACCOUNTS      CHANGE

I-71         UK           16895         UK
II-71        754          17059         +164
III-71       817          17181         +122
IV-71        599          17177         +96
I-72         778          17257         +80
II-72        860          17354         +97
III-72       791          17483         +129
IV-72        798          17752         +269
I-73         998          18013         +261
II-73        896          18087         +74

Nb: UK - Unknown

7.  Total Amount of Savings

The trend in the total amount of savings was studied by
fitting a least squares line through the observations. The data on total
amount of savings are contained in Table II.

GRAPH OF TOTAL NUMBER OF SAVERS VS TIME

TABLE II

TIME SERIES OF TOTAL AMOUNT OF SAVINGS

             TOTAL AMOUNT       MARGINAL        MEAN AMOUNT OF
QUARTER      OF SAVINGS ($M)    CHANGE ($M)     SAVINGS ($)

I-71         36.8345            UK              2180.20
II-71        37.5140            0.6795          2199.07
III-71       38.8286            1.3146          2259.97
IV-71        39.5192            0.6905          2300.70
I-72         40.5565            1.0374          2350.15
II-72        41.5743            1.0177          2395.66
III-72       42.1492            0.5749          2410.87
IV-72        42.4047            0.2555          2388.73
I-73         44.1283            1.7273          2449.80
II-73        44.5614            0.4431          2463.73

The standard deviation of the amount of savings in each
account was estimated to be $5,314. The standard error of the mean was
estimated to be $40.54. Using the t test, any two means differing by
more than $66.86 are considered to be significantly different at the ten
percent level of significance. Thus the hypothesis that the mean was
constant over the period of observation was rejected. The average rate
of increase in the mean was found to be 1.1158 percent. This increase
could be partly accounted for by earnings accrued in the accounts. On
the average, 95.01 percent of the quarterly earnings was retained in the
institution; thus a quarterly increase of 1.219 percent in the mean could
be expected if there were no change in the structure of the population.

GRAPH OF MEAN AMOUNT OF SAVINGS VS TIME

The following results were obtained by fitting the trend line
to the total amount of savings:

(1)  Mean of total savings             =  40.3899 million dollars

(2)  Standard deviation                =  2.4153 million dollars

(3)  Constant = a                      =  36.011 million dollars

(4)  Coefficient = b                   =  0.876 million dollars per quarter

(5)  Standard error of b               =  0.039 million dollars per quarter

(6)  Coefficient of determination      =  0.986

(7)  Standard error of dependent
     variable                          =  0.286 million dollars
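
The constant and coefficient above can be reproduced from Table II with an ordinary least squares fit. The Python sketch below assumes that the fit used the nine quarters I-71 through I-73, coded t = 1, ..., 9; this coding is an inference (it matches the reported mean of 40.3899) and is not stated in the text.

```python
# Total amount of savings ($M) for Quarters I-71 through I-73 (Table II).
savings = [36.8345, 37.5140, 38.8286, 39.5192, 40.5565,
           41.5743, 42.1492, 42.4047, 44.1283]
quarters = list(range(1, len(savings) + 1))  # assumed coding t = 1..9

def least_squares_line(x, y):
    """Return (intercept, slope) of the least squares line y = a + b x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

a, b = least_squares_line(quarters, savings)
mean_total = sum(savings) / len(savings)
```

Under this coding the fitted values agree with items (1), (3) and (4) to the precision reported.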


During the period of observation the total amount of savings
was increasing at a constant rate of 0.876 million dollars per quarter.
The annual growth rate based on this would be 8.675%.

It was found, on the average, that 95.01% of earnings was
left in the accounts each quarter, and so the annual growth rate caused
by new accounts and increases in existing accounts, less losses due to
closing of accounts and reductions in levels of savings, would be
8.675% - 0.9501 x 5.13% = 3.801%.
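
The arithmetic behind these two growth rates can be checked with a few lines of Python. The sketch assumes that the annual rate was obtained as four times the quarterly coefficient divided by the mean of total savings (this reproduces 8.675% but is not stated explicitly), and it treats the 5.13% used in the text as the annual earnings rate.

```python
slope = 0.876            # fitted coefficient b, million dollars per quarter
mean_savings = 40.3899   # mean of total savings, million dollars
retained = 0.9501        # fraction of earnings left in the accounts
earnings_rate = 5.13     # percent per year, as used in the text

# Annual growth as a percentage of the mean level of savings.
annual_growth = 4 * slope / mean_savings * 100

# Growth remaining after removing the part due to retained earnings.
net_growth = annual_growth - retained * earnings_rate
```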

A second regression was performed using the marginal change
as dependent variable. The following results were obtained:

(1)  Mean of first differences         =  0.9117 million dollars per quarter

(2)  Standard deviation                =  0.4622 million dollars per quarter

(3)  Constant = a                      =  0.823 million dollars per quarter

(4)  Coefficient = b                   =  0.020 million dollars per quarter^2

(5)  Standard error of b               =  0.077 million dollars per quarter^2

(6)  Coefficient of determination      =  0.011

(7)  Standard error of dependent
     variable                          =  0.460 million dollars per quarter

GRAPH OF TOTAL AMOUNT OF SAVINGS VS TIME


It  was  concluded  that  there  was  no  trend  in  the  net  change 
of  total  savings  in  each  quarter  over  the  period  of  observation. 


B.  ESTIMATION OF PARAMETERS

1.  Model I


The  arrival  rate  can  be  estimated  by  adding  up  all  the  new 
accounts  opened  during  the  period  of  observation  and  dividing  by  the 
number  of  time  periods. 

The distribution of new accounts can be estimated by taking
samples from each batch of new accounts, adding up the accounts entering
each class and dividing by the total number of accounts in the samples.
Thus:

\[ \hat{p}_j = \frac{\sum_{t=1}^{T} e_j^t}{\sum_{t=1}^{T} r^t} \]

where

\( \hat{p}_j \)  =  maximum likelihood estimate of the probability of a new
account entering the j-th class

\( e_j^t \)  =  number of new accounts entering the j-th class at time t

\( r^t \)  =  number of accounts in the sample of new accounts at time t

T  =  number of periods of observation

The average number of accounts entering each class can be
found by:

\[ c' = \hat{A}_r (\hat{p}_1, \hat{p}_2, \ldots, \hat{p}_m) \]

where

\( c' \)  =  average number of new accounts entering each class at each
time period

\( \hat{A}_r \)  =  maximum likelihood estimate of the arrival rate

The stationary transition probabilities can be estimated by
the following [2]:

\[ \hat{p}_{ij} = \frac{n_{ij}}{n_{i.}}
   = \frac{\sum_{t=1}^{T} n_{ij}(t)}{\sum_{k=1}^{m} \sum_{t=1}^{T} n_{ik}(t)}
   = \frac{\sum_{t=1}^{T} n_{ij}(t)}{\sum_{t=1}^{T} n_i(t-1)} \]

where

\( \hat{p}_{ij} \)  =  Maximum likelihood estimate of the probability
of transition from class i to class j in any one given period

\( n_{ij} \)  =  Total number of accounts that have moved from
class i to class j over the period of observation (0 - T)

\( n_{i.} \)  =  Total number of accounts that were in class i at
the beginning of each period, summed over the periods of observation

\( n_{ij}(t) \)  =  Number of accounts that moved from class i to
class j during the period between t-1 and t

\( n_{ik}(t) \)  =  Number of accounts that moved from class i to
class k during the period between t-1 and t

\( n_i(t-1) \)  =  Total number of accounts in class i at the time
period (t-1)
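
In computational terms the estimate pools the transition counts of all periods and divides each pooled count by its row total. A minimal Python sketch follows; the count matrices are hypothetical and the function name is illustrative.

```python
def stationary_estimate(count_matrices):
    """Pool transition counts over periods and normalize each row.

    count_matrices[t][i][j] is the number of accounts that moved from
    class i to class j during period t.
    """
    m = len(count_matrices[0])
    pooled = [[sum(c[i][j] for c in count_matrices) for j in range(m)]
              for i in range(m)]
    estimate = []
    for row in pooled:
        row_total = sum(row)
        # A class never occupied has no estimable row; leave it at zero.
        estimate.append([n / row_total if row_total else 0.0 for n in row])
    return estimate

# Hypothetical counts for two periods and two classes.
period_1 = [[8, 2], [1, 9]]
period_2 = [[6, 4], [3, 7]]
p_hat = stationary_estimate([period_1, period_2])
```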

Anderson and Goodman [2] showed that as n, the total number
of entities in the system, tends to infinity the set
\( (n_{i.})^{1/2} (\hat{p}_{ij} - p_{ij}) \)
has a joint normal distribution with means 0, variances
\( p_{ij}(1 - p_{ij}) \) and covariances \( -\delta_{ig}\, p_{ij}\, p_{gh} \),
where \( \delta_{ig} = 0 \) if \( i \neq g \) and \( \delta_{ii} = 1 \).

This fact can be used to test if certain transition probabilities
\( p_{ij} \) have specified values \( p_{ij}^0 \) and if the transition
probabilities are indeed stationary.

2.  Statistical Tests

The chi square test of goodness of fit can be used to test
hypotheses concerning transition probabilities. To test the hypothesis
that \( p_{ij} = p_{ij}^0 \), j = 1, 2, ..., m, the quantity

\[ \sum_{j=1}^{m} \frac{n_{i.} (\hat{p}_{ij} - p_{ij}^0)^2}{p_{ij}^0} \]

under the null hypothesis has an asymptotic chi square distribution with
m-1 degrees of freedom. The null hypothesis is rejected if \( \hat{p}_{ij} \)
differs from \( p_{ij}^0 \) to such an extent that the above test statistic
exceeds the \( (1 - \alpha) \) percentile of the chi square distribution
with m-1 degrees of freedom, where \( \alpha \) is the level of significance.

As the quantities
\( \sum_{j} n_{i.} (\hat{p}_{ij} - p_{ij}^0)^2 / p_{ij}^0 \)
for different i are independent, the summation over i is distributed as a
chi square distribution with m(m-1) degrees of freedom.

To test the hypothesis that the transition probabilities are
stationary over the period of observation the following test statistic can
be used [2]:

\[ \chi^2 = \sum_{i=1}^{m} \chi_i^2
   = \sum_{i=1}^{m} \sum_{j=1}^{m} \sum_{t=1}^{T}
     \frac{n_i(t-1) \{ \hat{p}_{ij}(t) - \hat{p}_{ij} \}^2}{\hat{p}_{ij}} \]

where

\( n_i(t-1) \)  =  total number of entities in class i at time t-1

\( \hat{p}_{ij}(t) \)  =  estimate of the transition probability at time t,
obtained by counting the number of transitions from
class i to class j and dividing by \( n_i(t-1) \)

\( \hat{p}_{ij} \)  =  estimate of the transition probability from class i to
class j

\[ \hat{p}_{ij} = \frac{\sum_{t=1}^{T} n_{ij}(t)}{\sum_{t=0}^{T-1} n_i(t)} \]

The asymptotic distribution of this test statistic is chi square
with m(m-1)(T-1) degrees of freedom. The number of degrees of freedom
is reduced from m(m-1)T by m(m-1), the number of parameters estimated.
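
The statistic can be computed directly from the per-period transition count matrices. The Python sketch below is an illustration with hypothetical counts; it takes the row total of each count matrix as the number of entities in the class at the start of the period, and it skips cells whose pooled estimate is zero (a convention assumed here, not taken from the text).

```python
def stationarity_chi_square(count_matrices):
    """Chi square statistic and degrees of freedom for stationarity.

    count_matrices[t][i][j] is the number of transitions from class i to
    class j during period t; each row total is taken as n_i(t-1).
    """
    m = len(count_matrices[0])
    periods = len(count_matrices)
    pooled = [[sum(c[i][j] for c in count_matrices) for j in range(m)]
              for i in range(m)]
    statistic = 0.0
    for i in range(m):
        pooled_total = sum(pooled[i])
        if pooled_total == 0:
            continue
        for c in count_matrices:
            n_i = sum(c[i])
            if n_i == 0:
                continue
            for j in range(m):
                p_hat = pooled[i][j] / pooled_total   # stationary estimate
                if p_hat == 0.0:
                    continue
                p_t = c[i][j] / n_i                   # estimate for period t
                statistic += n_i * (p_t - p_hat) ** 2 / p_hat
    degrees_of_freedom = m * (m - 1) * (periods - 1)
    return statistic, degrees_of_freedom

# Hypothetical counts for two periods and two classes.
stat, df = stationarity_chi_square([[[8, 2], [1, 9]], [[6, 4], [3, 7]]])
```

The statistic is then compared with the appropriate percentile of the chi square distribution with the returned degrees of freedom.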




The chi square test is based on a statistic which follows a
chi square distribution when n, the total number of entities in the system,
tends to infinity. Hence it has been customary for statistics textbooks
to recommend that the smallest expected number of entities in each class
should exceed five or ten. If this requirement is not met in the original
classification then combination of neighboring classes, until the rule
is satisfied, is recommended. Cochran [4] challenged this arbitrary
rule, claiming that the power of the test is reduced by pooling classes to
conform to the rule. He found that for goodness of fit tests of bell shaped
curves such as the normal distribution there is little disturbance to the
five percent level when a single expectation is as low as 1/2. He added
that the result also holds for the one percent level if the
number of degrees of freedom exceeds six, and that two expectations
as low as one may be allowed with negligible disturbance to the five
percent level.

Using Cochran's results, classes with small expectations
were pooled to ensure that the smallest expected number of entities in
each class exceeded one and no more than two classes had expected
numbers less than two. The number of degrees of freedom was reduced
from m(m-1)(T-1) by the number of classes eliminated.
3.  Model II

The predictor for the arrival rate may be obtained by applying
the method of least squares to the number of new accounts observed in
each time period and the corresponding exogenous variables.

The distribution of new accounts is estimated in each period
by dividing the number of new accounts entering each class by the total
number of accounts in the sample.

The transition probabilities \( p_{ij}(t) \) are estimated by dividing
the number of accounts that moved from class i to class j at time t by
the number of accounts in class i at time t-1.

These estimates are maximum likelihood estimates, as in
Model I. They can be transformed into logits and then regressed against
the set of exogenous variables.

4.  Estimation of Transition Probabilities

Each of the six hundred and twenty-two accounts was
categorized in accordance with the classification given in Section A.1. of
this chapter. The number of accounts in each class for each quarter
during the period of observation is presented in Table III. The relative
fraction of accounts, obtained by dividing the number of accounts in each
class by six hundred and twenty-two, is shown in Table IV.

It can be seen that twenty-seven percent of the accounts in
the sample were closed after ten quarters. The proportion of active
accounts in each class was found by dividing the number of accounts in
each class by the total number of active accounts. The results are
presented in Table V. The time series of amount of savings in each class
is presented in Table VI.

A chi square test was performed to test if the distribution of
active accounts had changed during the period of observation. The number
of degrees of freedom of the distribution of the chi square statistic is
eighty-one and the ninetieth percentile of the distribution is 98.01. The
chi square statistic was found to be 64.2. Thus, the null hypothesis
that the distribution did not change with time could not be rejected. This
result was rather surprising as it could imply that the probability of an
account closing did not depend on the class it was in.
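
The text does not give the form of this test. One standard possibility is the chi square test of homogeneity on a table of counts with one row per quarter and one column per active class; with ten quarters and ten active classes it has (10-1)(10-1) = 81 degrees of freedom, which agrees with the figure quoted. The Python sketch below implements that assumed form on hypothetical counts.

```python
def homogeneity_chi_square(table):
    """Chi square statistic and degrees of freedom for a table of counts.

    table[r][c] is the observed count in row r (for example a quarter) and
    column c (for example a class). Expected counts are computed from the
    row and column totals; all totals are assumed to be positive.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for r, row in enumerate(table):
        for c, observed in enumerate(row):
            expected = row_totals[r] * col_totals[c] / grand_total
            statistic += (observed - expected) ** 2 / expected
    degrees_of_freedom = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, degrees_of_freedom

# Hypothetical 2 x 2 example.
stat_h, df_h = homogeneity_chi_square([[30, 20], [20, 30]])
```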

Each account was examined at each quarter to determine if
it had made a transition to another class. The transitions were
accumulated in a transition count matrix. The ij-th element of this matrix
is the number of transitions from the i-th class to the j-th class in a
given quarter. An example of a transition count matrix is shown in Table
VII. The transition count matrices for other quarters are contained in
Appendix B.

The estimate of each quarter's transition matrix was obtained
by the method described earlier in this section. An example, the estimate
of the transition matrix of Quarter II-71, is shown in Table VIII. The
estimates for subsequent quarters are contained in Appendix C.

A cumulative transition count matrix was formed by adding
successive transition count matrices. Thus the cumulative transition
count matrix of Quarter I-72 is the sum of the transition count matrices
of Quarters II-71, III-71, IV-71 and I-72. The cumulative transition
count matrices are contained in Appendix D.

The time stationary estimate of the transition matrix was
obtained by dividing each element of the cumulative transition count
matrix by its row sum. For the sake of brevity each estimate of the
transition matrix was termed CPM Z, where Z is a Roman numeral indicating
that the data used in the estimation came from the first Z quarters of the
period of observation. Thus CPM V stands for the estimate of the stationary
transition matrix using data from Quarter I-71 to Quarter I-72. CPM II
through CPM X are contained in Appendix E.

5.  Test of Time Stationary Assumption

It can be seen from the transition count matrices that there
is a large number of elements with zero or one transition counts. The
chi square test could not, therefore, be applied directly. The classes
of each row were combined so that the smallest class had an expectation
exceeding one count and no more than two classes had an expectation of
less than two counts. The following grouping was obtained:


Each line gives the row class followed by the entries of that row in
column order (Classes I to XI); a dash marks a cell pooled with
neighboring classes.

II      .046  .883  .054  -  -  -  -  -  .017
III     .023  .110  .733  .104  .015  -  -  -  .015
IV      .040  .040  .075  .711  .089  -  -  -  .046
V       .050  -  .130  -  .672  .104  -  -  .046
VI      .081  -  .147  -  -  .536  -  -  .237
VII     .044  .087  -  -  -  -  .760  .084  -  .026
VIII    .064  -  -  .109  -  -  .611  .169  -  .049
IX      .075  -  -  -  .083  -  .083  .636  -  .123
X       .067  -  -  .102  -  -  -  -  .712  .120
XI      .042  -  -  -  .097  -  -  -  .861

The number of degrees of freedom for the above matrix is
equal to the number of elements minus the number of linear constraints,
(47-10). As the number of matrices is nine and the number of parameters
estimated is (47-10), the number of degrees of freedom for the distribution
of the chi square statistic for the test of a stationary transition
probability matrix is (47-10)(9-1) = 296.

The rejection region for the 10% level of significance begins at 328.6.
The chi square statistic was found to be 288.7; thus the null hypothesis
that the transition probabilities were stationary could not be rejected.

6.  The Initial Distribution of the Population

The initial distribution of the population was determined by
recording all accounts with balances exceeding one thousand dollars on
31 March 1972. The number of accounts below one thousand dollars
was found by taking the difference between the total number of accounts
and the number of accounts recorded. The mean and variance of the amount
of savings in an account in each class were estimated from this sample.
Table IX is a summary of the data obtained.

It can be seen that the estimate of the mean of each class,
except for Classes II and XI, is close to the midpoint of the respective
class interval. All the means are below the midpoints as there are more
accounts at the lower end of each class. The estimates of variance of
Classes II to IX are very close because the class intervals are the same
and the distribution of accounts in each class has the same general
shape. The estimates of variance for Classes X and XI show the importance
of the length of the class interval on predictions of total amount of
savings. The variance of the amount of savings of accounts in Classes X
and XI can be reduced by the introduction of more classes to cover the same
interval. However, this could lead to classes having smaller populations
which may not possess the Markovian property.

TABLE IX

SIZE DISTRIBUTION OF THE ENTIRE POPULATION
OF ACCOUNTS AT QUARTER I-72

                             NUMBER OF
CLASS    INTERVAL ($)        ACCOUNTS     MEAN ($)    VARIANCE

I        0                   0            0           0
II       1 - 1999            12373        353         246544
III      2000 - 3999         1793         2837        310372
IV       4000 - 5999         1034         4916        317481
V        6000 - 7999         563          6855        328649
VI       8000 - 9999         366          8905        346948
VII      10000 - 11999       372          10757       362291
VIII     12000 - 13999       209          12920       329649
IX       14000 - 15999       153          14961       314260
X        16000 - 19999       183          17791       1355376
XI       20000 - 99999       205          27888       110502144
         > 100000            6            156558      2.983 x 10^9

As a compromise, the class intervals in this paper were selected
such that each class had a minimum of one hundred and fifty accounts.
The six accounts that exceeded $100,000 were considered to be unchanged
during the period of observation. These accounts added up to $0.94
million. Thus the predicted amount of total savings could differ by one
million dollars because of the action of a handful of savers.
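
As a check on Table IX, the total amount of savings can be recovered by multiplying the number of accounts in each class by the class mean and summing. The short Python sketch below does this with the tabulated values; the variable names are illustrative.

```python
# (class, number of accounts, mean savings in dollars) from Table IX.
table_ix = [
    ("II", 12373, 353), ("III", 1793, 2837), ("IV", 1034, 4916),
    ("V", 563, 6855), ("VI", 366, 8905), ("VII", 372, 10757),
    ("VIII", 209, 12920), ("IX", 153, 14961), ("X", 183, 17791),
    ("XI", 205, 27888), (">100000", 6, 156558),
]

total_accounts = sum(n for _, n, _ in table_ix)
total_savings = sum(n * mean for _, n, mean in table_ix)      # dollars
large_account_savings = table_ix[-1][1] * table_ix[-1][2]     # dollars
```

The totals come to 17,257 accounts and about $40.56 million, in line with the Quarter I-72 entries of Tables I and II, and the six largest accounts contribute about $0.94 million.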

7.  The Size Distribution of New Accounts

Each  new  account  of  the  samples  of  new  accounts  was  clas- 
sified according  to  the  rule  given  in  Section  A.    1 .  of  this  chapter.     The 
number  of  new  accounts  in  each  class  for  Quarter  11-71  through  Quarter 
11-73  is  shown  in  Table  XI. 

The  maximum  likelihood  estimate  of  the  probability  of  a  new 
account  entering  each  class  was  obtained  by  dividing  the  number  of  new 
accounts  in  each  class  by  the  total  number  of  new  accounts.     The  quarterly 
estimates  of  the  probability  of  a  new  account  entering  each  class  and  the 
time  stationary  estimates  are  presented  in  Table  XII. 

A  chi  square  test  was  performed  to  test  the  hypothesis  that 
the  probabilities  were  time  stationary.     The  number  of  degrees  of  freedom 
of  the  distribution  of  the  chi  square  statistic  was  seventy-two  and  the 
ninetieth  percentile  of  the  distribution  is  87.84.     The  chi  square  statistic 
obtained  was  68.8.    Thus  the  null  hypothesis  could  not  be  rejected  at 
the  ten  percent  level  of  significance. 
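
The stationarity test described above can be sketched as a chi square test on a quarter-by-class table of counts. The sketch below is illustrative only: the function name and the toy counts are hypothetical, and the statistic shown is the usual Pearson form, which is assumed (not confirmed by the text) to be the one used. With nine quarters and ten classes the degrees of freedom come to (9 - 1)(10 - 1) = 72, which agrees with the figure quoted.

```python
def stationarity_chi_square(counts):
    """Pearson chi square test that class-entry probabilities are the same
    in every quarter. counts[q][k] is the number of sampled new accounts
    of quarter q that fall in class k (hypothetical layout)."""
    n_quarters = len(counts)
    n_classes = len(counts[0])
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(row[k] for row in counts) for k in range(n_classes)]
    grand_total = sum(row_totals)

    # Time stationary (pooled) estimate: class total over all new accounts.
    pooled = [c / grand_total for c in col_totals]

    stat = 0.0
    for q in range(n_quarters):
        for k in range(n_classes):
            expected = row_totals[q] * pooled[k]
            # Assumes every class received at least one account overall.
            stat += (counts[q][k] - expected) ** 2 / expected

    dof = (n_quarters - 1) * (n_classes - 1)
    return stat, dof, pooled


# Toy data: three quarters, three classes (not taken from Table XI).
toy_counts = [[200, 30, 20], [190, 35, 25], [205, 28, 17]]
stat, dof, pooled = stationarity_chi_square(toy_counts)
print(round(stat, 3), dof, [round(p, 4) for p in pooled])
```

The null hypothesis is retained when the statistic falls below the chosen percentile of the chi square distribution with `dof` degrees of freedom.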

As a further check a one-way analysis of variance was performed.
The results are as follows:

Total number of observations       =  2250
Average of all observations        =  2535.38
Standard error within groups       =  8732.41
Degrees of freedom                 =  2241
Standard error between groups      =  11488.08
Degrees of freedom                 =  8
F statistic                        =  1.73
Level of significance              =  0.0865

Thus the null hypothesis that the mean amount of savings
of new accounts is constant over the period of observation is rejected
at the 10% level of significance.
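
The reported figures can be cross-checked with a few lines of arithmetic. This is only a consistency check, under the assumption that the two "standard errors" are the square roots of the within-group and between-group mean squares; the quarterly means are those listed in Table X and the variable names are illustrative.

```python
# Quarterly sample means from Table X (each sample has 250 accounts).
quarterly_means = {
    "II-71": 1671.34, "III-71": 1960.13, "IV-71": 2500.38,
    "I-72": 2169.10, "II-72": 3193.56, "III-72": 2812.04,
    "IV-72": 2271.53, "I-73": 4054.80, "II-73": 2185.53,
}
sample_size = 250

n_groups = len(quarterly_means)
n_obs = sample_size * n_groups
# With equal sample sizes the grand mean is the mean of the group means.
grand_mean = sum(quarterly_means.values()) / n_groups

df_between = n_groups - 1
df_within = n_obs - n_groups

# Assumption: the reported values are root mean squares.
rms_within = 8732.41
rms_between = 11488.08
f_statistic = (rms_between / rms_within) ** 2

print(n_obs, df_between, df_within)
print(round(grand_mean, 2), round(f_statistic, 2))
```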

The  mean  and  standard  deviation  of  the  amount  of  savings  of 
the  samples  of  new  accounts  are  as  follows: 

TABLE X

MEAN, STANDARD DEVIATION, MEDIAN, MAXIMUM
VALUE AND MINIMUM VALUE OF SAMPLES OF NEW ACCOUNTS

Quarter     Mean ($)    Standard     Median    Maximum    Minimum
                        Deviation              Value      Value

II-71       1671.34      3615.79     279.5      25000.       1.
III-71      1960.13      5038.32     301.5      52518.       1.
IV-71       2500.38      6561.85     300.0      50000.       1.
I-72        2169.10      5553.17     224.5      40000.       2.
II-72       3193.56      8641.02     340.50    103157.       1.
III-72      2812.04      8264.18     282.50    100032.       1.
IV-72       2271.53      7642.48     146.50    100000.       1.
I-73        4054.80     18161.52     238.5     200000.       1.
II-73       2185.53      6536.75     101.5      50000.       2.

Nb.  sample size = 250

Duncan's Multiple Range Test showed that the means of
Quarters II-71, III-71, IV-71, I-72, IV-72 and II-73 are significantly
different from that of Quarter I-73 at the ten percent level of significance.
The means of Quarters II-71 and II-72 are also significantly different
at the ten percent level of significance. The differences between the
means of other quarters were not considered significant.

The means are greatly influenced by the large accounts.
The mean of Quarter I-73 would drop to $3267.87 if the $200000 account
were deleted from the sample. This reduced mean would be significantly
different from that of Quarter II-71 only.

Deleting accounts that were greater than $100000 from the
samples reduced the means of Quarters II-72, III-72, IV-72 and I-73 to
2792.10, 2421.59, 1879.04 and 2292.24 respectively. The maximum
difference between the means is 1120.76, which is considered insignificant
at the ten percent level of significance.
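
The sensitivity of a sample mean to one large account follows directly from the figures in Table X. The sketch below removes the single largest account from a sample of 250 and recomputes the mean; the helper name is hypothetical. Under that single-deletion assumption it reproduces the $3267.87 figure for Quarter I-73 and, to within rounding, the reduced means quoted for Quarters II-72, III-72 and IV-72.

```python
def mean_without_largest(mean, largest, n=250):
    """Mean of the remaining n - 1 accounts after the largest is deleted."""
    return (mean * n - largest) / (n - 1)


# (mean, maximum value) pairs copied from Table X.
samples = {
    "II-72": (3193.56, 103157.0),
    "III-72": (2812.04, 100032.0),
    "IV-72": (2271.53, 100000.0),
    "I-73": (4054.80, 200000.0),
}

reduced = {
    quarter: mean_without_largest(mean, largest)
    for quarter, (mean, largest) in samples.items()
}

for quarter, value in reduced.items():
    print(quarter, round(value, 2))
```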

8.         Predictors  of  Transition  Probabilities 

The  corresponding  estimates  of  transition  probabilities  of 
each  quarter  were  grouped  together,  transformed  into  logits  and  regressed 
against  the  following  set  of  exogenous  variables: 

X1    =   Dummy variable for quarters of the year
X2    =   California non-agricultural employment
X3    =   Advertising and promotional expense of the savings
          institution
X4    =   Prime commercial paper rate, 4-6 months
X5    =   U. S. Government securities rate, 6 months
X6    =   Corporation bonds rate
X7    =   Wholesale price index lagged by one period
X8    =   U. S. Government securities rate, 3 months
X9    =   California personal income
X10   =   U. S. total credit

The values of these variables are contained in Table XIV.
There was some difficulty in transforming the transition
probabilities, as a number of them were equal to zero and the logit of
zero is minus infinity. The following rule was used to get around this
problem:

1.  If there are more than two estimates for p_ij(t), t = II-71,
III-71, ..., I-73, equal to zero, assume that p_ij(t) is
constant over the period of observation and use the time
stationary estimate obtained for Model I. No regression
will be performed for these elements.

2.  If  there  are  one  or  two  zeros  in  the  estimates,  replace  the 
zeros  by  the  time  stationary  estimate  and  proceed  with  logit 
transformation  and  regression. 

The  number  of  transition  probabilities  removed  by  these  rules 
was  seventy-two.    As  there  were  one  hundred  and  ten  elements  in  the 
transition  matrix  that  required  estimation,  application  of  these  rules  left 
a  balance  of  thirty-eight  elements  for  regression. 
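
The zero-handling rule can be sketched as a small helper. This is a minimal illustration, assuming natural logarithms for the logit of a transition probability; the function names and the sample series are hypothetical and not taken from the thesis data.

```python
import math


def logit(p):
    """Natural-log logit; undefined for p = 0 or p = 1."""
    return math.log(p / (1.0 - p))


def prepare_logits(estimates, stationary):
    """Apply the two-part rule to one element's quarterly estimates.

    Returns None when more than two estimates are zero (the element is
    then held at its time stationary value and not regressed); otherwise
    returns the logits, with any zero replaced by the stationary estimate.
    """
    zeros = sum(1 for p in estimates if p == 0.0)
    if zeros > 2:
        return None
    return [logit(p if p > 0.0 else stationary) for p in estimates]


# Hypothetical series of eight quarterly estimates for one element.
series = [0.04, 0.00, 0.06, 0.05, 0.03, 0.05, 0.07, 0.04]
print(prepare_logits(series, stationary=0.05))

# Elements left for regression, as counted in the text.
print(110 - 72)
```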

The transition matrix for Quarter II-73 was not included in
the regression in order that it could be used to test the correctness of
the predictors obtained with data from earlier periods. Thus, there were
eight data points in the regressions instead of nine.

In the first regressions performed, it was found that X8,
U. S. Government securities rate, 3 months, X9, California personal
income, and X10, U. S. total credit, were highly correlated with each other
and with some of the other exogenous variables (R ≈ .98). To reduce the
problem of multicollinearity, these three variables were dropped from
the regression equations.

The  following  criteria  were  used  to  determine  if  the  variance 
of  the  logits  of  transition  could  be  explained  by  the  exogenous  variables: 

1.  The F statistic obtained by the ratio of the estimate of the
variance before and after the introduction of an independent
variable must exceed 2.06, the eightieth percentile of the
F(7,6) distribution.

2.  The coefficient of determination, R^2, must exceed 0.70.

Of  the  thirty-eight  regressions  only  ten  were  found  to  be 
significant  according  to  these  criteria.    As  each  row  of  the  transition 
matrix  would  be  divided  by  the  sum  of  its  elements  these  ten  elements 
could  cause  significant  changes  to  the  transition  matrix. 

The  predictors  for  the  ten  logits  of  transition,  obtained  by 
regression,  are  as  follows: 


L(2,1)    =    -1.919   +   0.085X    -   0.300X
                           (0.049)       (0.085)

L(3,2)    =     9.386   -   0.350X    -   8.488X
                           (0.150)       (3.375)

L(4,4)    =    -0.374   +   0.086X    +   0.217X
                           (0.045)       (0.078)

L(5,5)    =     2.627   -   0.402X
                           (0.107)

L(5,6)    =   -11.091   +   0.223X    +   6.620X
                           (0.066)       (3.653)

L         =    -3.001   -   0.498X    -   0.257X    +   0.361X
                           (0.171)       (0.119)       (0.220)

L         =     9.163   +   0.136X    -   0.734X    -   2.264X
                           (0.020)       (0.233)       (1.535)

L         =    -5.784   +   0.143X    +   0.140X    +   0.130X
                           (0.073)       (0.052)       (0.094)

L         =     7.557   -   0.382X        0.083X        7.066X
                           (0.080)       (0.056)       (2.275)

L(11,10)  =    -6.469   +   0.758X
                           (0.087)

These logits were then transformed back into probabilities by
taking the anti-logarithms and dividing by one plus the anti-logarithms
of the logits. Thus,

p_ij  =  exp(L_ij) / (1 + exp(L_ij))

The  frequency  of  appearance  of  each  exogenous  variable  is 
as  follows: 

VARIABLE  FREQUENCY 

1  6 

2  0 

3  4 

4  2 

5  5 

6  0 

7  4 

The estimates of transition probabilities that were found to
vary significantly with the set of exogenous variables appeared to have
a seasonal effect, as the dummy variable appeared most frequently in the
regressions.

An increase in X5, U. S. Government securities rate, would
result in an increase in the probability of an account moving from
Class XI to Class X. A possible explanation is that savers in Class XI
will reduce their passbook account savings and invest in U. S. Government
securities when the securities rate increases. However, a consistent set
of explanations could not be given for the ten predictors, so a non-causal
approach had to be followed.

Nb.  The number in brackets below each regression coefficient is the
standard error of the coefficient.

The  transition  probabilities  without  predictors  were  considered 
to  be  stationary  during  the  period  of  observation.     Thus  the  nonstationary 
matrix  was  formed  by  replacing  ten  elements  of  the  estimate  of  the  stationary 
matrix with predicted values. To ensure that each row adds up to one, each
element was divided by the row sum. Selected transition matrices used
in Model II are contained in Appendix F.
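
The construction of one row of the nonstationary matrix can be sketched as follows: selected elements of a stationary row are overwritten with probabilities obtained from predicted logits, and the row is then divided by its sum. The function names, the three-element row and the logit value are hypothetical; natural logarithms are assumed, in line with the exp form of the inverse transformation given above.

```python
import math


def inverse_logit(value):
    """Back-transform a natural-log logit into a probability."""
    return math.exp(value) / (1.0 + math.exp(value))


def nonstationary_row(stationary_row, predicted_logits):
    """Replace selected elements with predicted values, then renormalize.

    predicted_logits maps a column index to the logit predicted for it.
    """
    row = list(stationary_row)
    for j, value in predicted_logits.items():
        row[j] = inverse_logit(value)
    total = sum(row)
    return [p / total for p in row]


# Hypothetical stationary row; element 1 is replaced by a predicted 0.30.
stationary_row = [0.70, 0.20, 0.10]
predicted = {1: math.log(0.30 / 0.70)}

row = nonstationary_row(stationary_row, predicted)
print([round(p, 4) for p in row], round(sum(row), 6))
```

Note that renormalizing changes every element of the row, including those that were not predicted, which is why a few predicted elements can alter the matrix noticeably.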

A chi square test was performed to test if the predictors could
predict the transition matrix for Quarter II-73. The predicted matrix was
formed by replacing ten elements of the Quarter I-73 cumulative matrix
with values obtained with the predictors and normalizing each row. The
problem of small expected number of transitions in certain elements of the
matrix was resolved by combining classes of each row in the manner
described in Section A.5. The ninetieth percentile for the chi square
distribution with 37 degrees of freedom is 48.84. The chi square statistic
obtained in the test was 35.25; thus, the null hypothesis that the
predicted matrix and the observed matrix of Quarter II-73 were the same
could not be rejected.

9.  Predictors of Arrival Rate

The number of new accounts opened in each quarter was
regressed against the same set of exogenous variables listed in sub-section
8. The predictor of arrival rate, measured in thousands per quarter,
was found to be as follows:

Ar  =  0.052  -  0.073X1  +  0.094X5
                (0.017)     (0.029)

The standard error of each coefficient is contained in the
bracket below each coefficient. The square of the multiple correlation
between the arrival rate and the exogenous variables, X1 and X5, was
0.846. The standard error of Ar before and after the regression was
0.7887 and 0.045.

According to this predictor, the number of new accounts
opened per quarter decreases as the year progresses, as X1, the dummy
variable for quarters, takes on values 1, 2, 3 and 4 for the four quarters
of the year. The number of new accounts opened would also increase as
the U. S. Government securities rate increases. No apparent reasons
could be found for this relationship. Predictions are compared with
observations in the following table.

TABLE XV
PREDICTED ARRIVAL RATE AND ACTUAL RATE OBSERVED

QUARTER      PREDICTION      OBSERVATION

II-72            777              860
III-72           751              791
IV-72            719              798
I-73            1015              998
II-73           1017              896
10.  Predictors of the Probabilities of a New Account Entering
Each Class

The estimates of the probability of a new account entering
each class obtained for Quarter II-71 through Quarter II-73 were collected
together. They were transformed into logits and regressed against the
set of exogenous variables listed in sub-section 8. Using the criteria
given in sub-section 8 to determine if the exogenous variables in a
regression could explain the variance of the logits, only four predictors
were accepted. They are:

L4    =     2.217   -   0.082X    -   0.466X
                       (0.029)       (0.180)

L7    =     6.187   -   0.089X    -   0.979X
                       (0.029)       (0.246)

L9    =    -4.482   -   0.184X    +   3.053X
                       (0.049)       (1.073)

L10   =   -10.725   +   0.184X    +   0.998X
                       (0.094)       (0.332)

The standard error of each coefficient is contained in the bracket below
each coefficient.

The logits are transformed back to estimates of probabilities
by:

10^(L_i) / (1.0 + 10^(L_i))

Logarithms to the base of 10 were used in both the forward
transformation and the inverse transformation. The base of the logarithm
does not affect the results of the regressions.
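
The remark about the base of the logarithm can be illustrated numerically. A base-10 logit is the natural-log logit divided by ln 10, so regression coefficients are merely rescaled by a constant, and the recovered probability is identical provided the same base is used in both directions. The function names and the probability used below are illustrative only.

```python
import math


def logit10(p):
    return math.log10(p / (1.0 - p))


def inverse_logit10(value):
    return 10.0 ** value / (1.0 + 10.0 ** value)


def logit_e(p):
    return math.log(p / (1.0 - p))


def inverse_logit_e(value):
    return math.exp(value) / (1.0 + math.exp(value))


p = 0.05  # arbitrary probability chosen for the illustration

# The two logits differ only by the constant factor ln(10).
print(logit_e(p) / logit10(p), math.log(10.0))

# Each inverse recovers p when paired with its own forward transform.
print(inverse_logit10(logit10(p)), inverse_logit_e(logit_e(p)))
```

Mixing the bases, for example applying the exponential inverse to a base-10 logit, would not recover the original probability.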

Predictions of the number of new accounts in each class
were checked by means of the chi square test. The number of degrees
of freedom of the distribution was thirty and the ninetieth percentile of
the distribution is 40.26. The chi square statistic obtained was 36.87.
Thus, the hypothesis that the predicted distributions matched the
observations could not be rejected.

The predicted arrival distributions for Quarters II-72 to II-73
are contained in Appendix G.
IV.  MODEL VALIDATION

A.  VALIDATION OF MODEL I

1.  Prediction of Sample Population Behavior

As there were no entries into the sample population, changes
to the structure were caused by accounts moving between classes and by
accounts closing. Thus the basic Markov chain model could be used to
model the behavior of this population.

It was decided to use the data from the five quarters, Quarter
I-71 through Quarter I-72, to estimate the time stationary transition matrix
and then use the matrix to predict the structure of the sample population
for Quarter II-72 through Quarter II-73. Predictions could then be
compared against observations and the chi-square test used to determine
the goodness of fit.

CPM V, the estimate of the time stationary transition matrix
with the first five quarters' data, was used to predict the number of accounts
in each class and the amount of savings in each class. The results of the
predictions of the number of accounts are contained in Table XVI. The
actual number observed and the chi-square statistic for each class are
presented next to the predictions.

The predictions were expected to diverge more and more from
observations as time progressed as errors would accumulate. The chi-
square statistic for the first prediction was 3.49 and the value for the
fifth prediction was 11.91. These correspond to the fourth percentile and
the seventieth percentile of the chi-square distribution with ten degrees
of freedom. The predicted distribution after five quarters still provided
a reasonably good fit to the observations.

The  predicted  amount  of  savings  in  each  class  and  the  actual 
amount  observed  are  presented  in  Table  XVII.      The  predictions  did  not 
match  the  observations  as  well  as  the  predictions  of  number  of  accounts. 
The  error  in  prediction  of  total  amount  of  savings  amounted  to  10.6  percent 
after  five  quarters.    The  difference  between  predicted  total  amount  of 
savings  and  the  amount  observed  could  be  explained  by  the  fact  that  the 
predicted  number  of  accounts  for  the  larger  classes,  class  VII  to  class  XI, 
were  generally  smaller  than  the  number  observed.     The  error  in  the  number 
of  accounts,  though  relatively  insignificant  in  absolute  magnitude,  when 
multiplied  by  the  average  amount  of  savings  would  amount  to  a  substantial 
sum.     Thus  the  estimates  of  transition  probabilities  between  classes  with 
low  average  amount  of  savings  per  account  and  those  with  high  average 
amount  of  savings  per  account  would  have  to  be  precise  to  yield  more 
accurate  predictions  of  total  amount  of  savings. 

A relatively small number of large accounts can increase the
variability of total amount of savings significantly. The error in prediction
for Quarter II-73 amounted to about four hundred and fifty-six thousand
dollars. Of this amount, four hundred and forty-two thousand dollars were
contributed by twenty-two accounts in Classes VIII, IX, X and XI. There
appears to be no easy way to reduce the variability
in total amount of savings caused by this small group of savers.
If the time stationary assumption is not violated then it is
legitimate to estimate the transition matrix with data from the entire period
of observation. The increase in data should yield better estimates of
transition probabilities. Thus CPM X, the transition matrix estimated
with all ten quarters' data, was used in predicting the number of accounts
and the amount of savings in each class. The results are presented in
Appendix H.

To demonstrate the importance of data on predictions, CPM II,
the transition matrix estimated with data from Quarter I-71 and Quarter
II-71, was also used to predict the number of accounts and the amount of
savings in each class. The results are also presented in Appendix H.

The chi-square statistics obtained using CPM V, CPM II
and CPM X are compared in the following table:

TABLE XVIII

COMPARISON OF CHI SQUARE STATISTICS OBTAINED
WITH CPM V, CPM II AND CPM X

                              QUARTER
MATRIX      II-72    III-72    IV-72     I-73    II-73

CPM V        3.49      2.45    11.05    10.74    11.91
CPM II       7.59     13.84    35.67    51.11    65.97
CPM X        3.26      1.12     5.93     5.03     3.57

The tenth percentile and the ninetieth percentile of the chi
square distribution with ten degrees of freedom are as follows:

P10  =   4.87
P90  =  15.99
Using P90 as a criterion to determine if the fit is acceptable,
it could be seen that predictions with CPM V and CPM X passed the test
for the entire period of prediction whereas predictions with CPM II were
only acceptable for the first two periods.
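
The percentile figures and the acceptance rule can be checked with a short sketch. For an even number of degrees of freedom the chi square distribution function has a closed form, which avoids any statistical library; the function name is illustrative, and the statistics are those of Table XVIII.

```python
import math


def chi2_cdf_even_df(x, df):
    """P(X <= x) for a chi square variable with an even number of
    degrees of freedom, using the finite-series closed form."""
    if df % 2 != 0 or df <= 0:
        raise ValueError("closed form shown here needs a positive even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k          # (x/2)^k / k!
        total += term
    return 1.0 - math.exp(-half) * total


table_xviii = {
    "CPM V": [3.49, 2.45, 11.05, 10.74, 11.91],
    "CPM II": [7.59, 13.84, 35.67, 51.11, 65.97],
    "CPM X": [3.26, 1.12, 5.93, 5.03, 3.57],
}
P90 = 15.99

# A prediction is acceptable when its statistic is below P90.
acceptable = {
    name: [stat < P90 for stat in stats]
    for name, stats in table_xviii.items()
}

print(round(chi2_cdf_even_df(4.87, 10), 3), round(chi2_cdf_even_df(15.99, 10), 3))
print(round(chi2_cdf_even_df(11.91, 10), 3))
print(acceptable)
```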

The  total  amount  of  savings  predicted  using  CPM  V,  CPM  II 
and  CPM  X  are  compared  in  the  following  table: 

TABLE XIX

COMPARISON OF TOTAL AMOUNT OF SAVINGS
OBTAINED WITH CPM V, CPM II AND CPM X ($M)

                              QUARTER
MATRIX      II-72    III-72    IV-72     I-73    II-73

CPM V       3.672     3.509    3.351    3.201    3.057
CPM II      3.553     3.293    3.060    2.851    2.664
CPM X       3.727     3.613    3.500    3.389    3.280
ACTUAL      3.627     3.535    3.509    3.404    3.418

The superiority of predictions with CPM X is apparent. The
percentage error in predicting the total amount of savings of Quarter II-73
is 4.0, which is less than half of that obtained using CPM V. The importance
of accurate estimates of transition probabilities is clearly demonstrated
by the above comparisons.
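
The percentage errors quoted for Quarter II-73 follow from the last column of Table XIX. The short calculation below uses only the tabulated values; the variable names are illustrative.

```python
# Quarter II-73 column of Table XIX, in millions of dollars.
predicted = {"CPM V": 3.057, "CPM II": 2.664, "CPM X": 3.280}
actual = 3.418

percent_error = {
    name: 100.0 * abs(actual - value) / actual
    for name, value in predicted.items()
}

for name, error in percent_error.items():
    print(name, round(error, 1))
```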

2.  Prediction of Behavior of Population

To predict the behavior of the entire population the model has
to include the process of arrivals and entrants. As the sample size was
small (about 3.5% of the population) it was decided to use the entire data
base to estimate the transition matrix.

The average arrival rate (number of new accounts opened per
quarter) was found to be 800.7 and the distribution of new accounts was
estimated to be as follows:


CLASS         A

II          0.7813
III         0.0680
IV          0.0484
V           0.0187
VI          0.0124
VII         0.0156
VIII        0.0089
IX          0.0111
X           0.0076
XI          0.0280

The  estimates  were  obtained  by  adding  up  the  number  of  new  accounts  in 
each  class  over  the  period  of  observation  and  dividing  by  the  total  number 
of  new  accounts  sampled. 

The  number  of  accounts  in  each  class  was  predicted  by 
adding  the  expected  number  of  accounts  moving  into  or  remaining  in  that 
class  from  the  population  of  accounts  already  in  the  system  and  the 
number  of  new  accounts  entering  that  class.     The  expression  used  in  the 
computation  can  be  found  in  Section  C  of  Chapter  II. 
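
The one-quarter update described in words above can be sketched as follows. The exact expression is in Chapter II and is not reproduced here, so the form used below (existing accounts redistributed by the transition matrix, plus the arrival rate times the entry distribution) is an assumption based on the verbal description. The three-class matrix, counts and arrival figures are hypothetical; each row sums to less than one, the shortfall being the probability of closing.

```python
def project_one_quarter(counts, transition, arrival_rate, entry_probs):
    """Expected number of accounts in each class after one quarter."""
    n = len(counts)
    projected = []
    for j in range(n):
        # Accounts already in the system that move into or stay in class j.
        from_existing = sum(counts[i] * transition[i][j] for i in range(n))
        # New accounts that enter class j.
        from_arrivals = arrival_rate * entry_probs[j]
        projected.append(from_existing + from_arrivals)
    return projected


# Hypothetical three-class example (not thesis data).
counts = [1000.0, 200.0, 50.0]
transition = [
    [0.90, 0.05, 0.00],
    [0.10, 0.80, 0.05],
    [0.00, 0.10, 0.85],
]
arrival_rate = 100.0
entry_probs = [0.80, 0.15, 0.05]

next_counts = project_one_quarter(counts, transition, arrival_rate, entry_probs)
print([round(c, 1) for c in next_counts], round(sum(next_counts), 1))
```

Repeated application of the same step gives the multi-quarter predictions, with the average amount of savings per class then applied to obtain the total amount of savings.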

The  predicted  total  number  of  accounts  and  the  total  amount 
of  savings  are  shown  in  the  following  table: 
TABLE XX

PREDICTED TOTAL NUMBER OF ACCOUNTS AND
TOTAL AMOUNT OF SAVINGS AND OBSERVED VALUES

QUARTER                      II-72    III-72    IV-72     I-73    II-73

TOTAL # OF         PRED.     17345     17447    17557    17664    17776
ACCOUNTS           ACT.      17354     17483    17485    17746    17820

TOTAL AMOUNT OF    PRED.     45.65     49.87    53.78    57.39    60.74
SAVINGS ($M)       ACT.      41.57     42.15    42.40    44.13    44.56

The  maximum  error  in  predicting  the  total  number  of  accounts 
was  82  which  was  about  half  a  percent  of  the  total  number  of  accounts. 
This  indicated  that  the  process  of  arrivals  and  the  process  of  departures 
were  probably  as  described  by  the  model  during  the  period  of  prediction. 

The failure of the model to predict the total amount of savings
could be due to the failure of the model to predict the structure of the
population or a violation of the assumption of a constant average amount
of savings in each class.

To test the hypothesis that the error in total amount of savings
was caused by error in predicting the number of accounts in each class, a
sample comprising one-fourth of the population at Quarter I-73 was taken
and compared with the predicted structure of active accounts. The
chi square test was used to determine the goodness of fit between the
predicted distribution and the distribution of the sample.

The number of degrees of freedom of the distribution of the
chi square statistic is eight and the ninetieth percentile of the distribution
is 13.36. The chi square statistic obtained was 111.0; thus, the null
hypothesis that the predicted distribution and the distribution of the sample
were the same could be rejected.

In examining the chi square statistic of each class it was
found that major sources of error came from Classes II and III, IV, V, VII
and XI (Classes II and III had been combined to ease the burden of
extracting data for the validation sample). It appeared that Classes IV, V,
VII and IX became much larger at the expense of Classes II and III. This
would account for the high predictions of total amount of savings.

Another  check  was  made  by  taking  the  difference  between  the 
predicted  number  of  accounts  in  the  sample  and  the  actual  number  of 
accounts  in  each  class  and  multiplying  by  the  respective  average  amount 
of  savings  of  each  class.     The  errors  in  the  amount  of  savings  in  each 
class  are  shown  in  Table  XXI. 

If the validation sample could be taken as a good representation
of the population then the error in prediction of the population could
be estimated by multiplying the error in the amount of savings in the
validation sample by four. Thus, the prediction of total amount of savings
would be high by $11.2 million. The observed error of $13.3 million
could therefore be considered to be mainly the result of errors in predicting
the structure of the population.
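
Both checks can be retraced from the class counts in Table XXI. The sketch below assumes the chi square statistic takes the usual form, the sum of (actual - predicted)^2 / predicted over the nine groups, which gives a value close to the 111.0 reported (the small gap is consistent with the rounding of the predicted counts). It also scales the sample error by four, as described above.

```python
# (predicted, actual) account counts per class group, from Table XXI.
table_xxi = {
    "II & III": (3435, 3699),
    "IV": (342, 222),
    "V": (180, 144),
    "VI": (95, 103),
    "VII": (138, 93),
    "VIII": (72, 60),
    "IX": (52, 41),
    "X": (48, 50),
    "XI": (122, 70),
}

contributions = {
    name: (actual - predicted) ** 2 / predicted
    for name, (predicted, actual) in table_xxi.items()
}
chi_square = sum(contributions.values())
degrees_of_freedom = len(table_xxi) - 1

# Total error in the sample's amount of savings, from Table XXI,
# scaled up because the sample is one-fourth of the population.
sample_error_dollars = 2800971
population_error_millions = 4 * sample_error_dollars / 1e6

print(round(chi_square, 1), degrees_of_freedom)
print(round(population_error_millions, 1))
print(sorted(contributions, key=contributions.get, reverse=True)[:3])
```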

TABLE XXI

ERRORS IN PREDICTING THE AMOUNT OF SAVINGS
IN THE VALIDATION SAMPLE

CLASS        PREDICTED    ACTUAL #    ERROR IN    ERROR IN
             # OF A/C     OF A/C      # OF A/C    AMOUNT OF
                                                  SAVINGS

II & III       3435         3699        -264       -182759
IV              342          222        +120       +589920
V               180          144        + 36       +246780
VI               95          103        -  8       - 71240
VII             138           93        + 45       +484065
VIII             72           60        + 12       +155040
IX               52           41        + 11       +164571
X                48           50        -  2       - 35582
XI              122           70        + 52      +1450175

TOTAL          4484*        4482        +  2**    +2800971

*   Should equal 4482. Discrepancy caused by rounding error.
**  Should equal 0. Discrepancy caused by rounding error.

Looking at the error in the prediction of amount of savings of
each class, it can be seen that Class XI is a major contributor to the total
error. It was suspected that the model failed because of sampling errors
which resulted in estimating higher probabilities of transition between
classes with low average amount of savings and those with large average
amount of savings.

To  check  out  this  hypothesis  the  following  changes  were  made 
to  CPM  X: 

1.  Accounts found to have made two or more transitions
between Classes II, III, IV and V and Classes VIII, IX, X and XI were
removed from the data base as these accounts would not be representative
of the normal behavior of the population. Eight accounts were rejected
according to this rule and CPM X was recomputed with the remaining six
hundred and fourteen accounts. This modified transition matrix was termed
MOD I.

2.  The 90% lower confidence limit was estimated for transition
probabilities from Classes II, III, IV and V to higher classes. The
Poisson distribution was used to approximate the binomial distribution
in cases when the total number of transitions observed was below seven.
The normal approximation was used when the number of transitions observed
exceeded seven. This modification was applied to MOD I and termed
MOD II.

3.  Further adjustments were made to a few transition probabilities
based on the results of the chi square fit using MOD I and MOD II.
The rationale for the adjustments is as follows:

Since the data base of accounts is inadequate for estimation
of population parameters, use the additional data available from the
validation sample to correct the estimation of certain parameters. Hypothesize
that the new matrix, termed MOD III, is the best estimate and proceed
with the prediction of total number of accounts and total amount of savings
in the institution. A good fit between predicted and observed total amount
of savings over the prediction interval would give support to the hypothesis.
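
The second modification can be illustrated with a sketch of a one-sided 90% lower confidence limit on a transition probability. The thesis does not give its formulas, so the choices below are assumptions: an exact Poisson lower limit on the expected count (found by bisection) divided by the number of trials when fewer than seven transitions were observed, and the usual normal approximation otherwise. The treatment of exactly seven transitions, the function names and the example counts are likewise hypothetical.

```python
import math
from statistics import NormalDist


def poisson_lower_limit(k, confidence=0.90):
    """One-sided lower limit on a Poisson mean given k observed events:
    the mean at which P(X >= k) equals 1 - confidence."""
    if k == 0:
        return 0.0
    alpha = 1.0 - confidence

    def upper_tail(mean):
        below = sum(math.exp(-mean) * mean ** i / math.factorial(i)
                    for i in range(k))
        return 1.0 - below

    low, high = 0.0, float(k)   # the tail probability exceeds alpha at mean k
    for _ in range(100):
        mid = (low + high) / 2.0
        if upper_tail(mid) < alpha:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


def lower_limit_probability(k, n, confidence=0.90):
    """Lower confidence limit for a transition probability estimated as k/n."""
    if k < 7:
        return poisson_lower_limit(k, confidence) / n
    p = k / n
    z = NormalDist().inv_cdf(confidence)
    return max(0.0, p - z * math.sqrt(p * (1.0 - p) / n))


print(round(lower_limit_probability(3, 400), 5))    # Poisson branch
print(round(lower_limit_probability(50, 200), 4))   # normal branch
```

Replacing the point estimates by such lower limits reduces the estimated flow from the small classes to the larger ones, which is the direction of correction the text argues for.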

MOD  I,  MOD  II  and  MOD  III  are  contained  in  Appendix  E. 

The  results  obtained  using  the  modified  matrices  are  compared 
against  predictions  using  CPM  X  in  Tables  XXII  and  XXIII. 
It can be seen from Table XXII that the structure of the
predicted distribution changed substantially with each modification. The
improvement in fit in the predicted distribution with each modification
had a corresponding effect in the prediction of total amount of savings.
However, the predicted total number of accounts was marginally degraded
by each modification. The changes, however, were not considered to be
significant as the percentage error was still of the order of less than one
percent.

Though  the  modifications  to  the  transition  matrix  improved  the 
predictions  they  do  not  prove  that  the  true  transition  matrix  should  be  as 
specified  by  MOD  III.     However,  with  the  amount  of  information  available 
the  best  estimate  of  the  transition  matrix  is  MOD  III.     Although  its  ability 
to  predict  the  structure  of  the  population  has  not  been  put  to  a  test,  the 
accurate  prediction  of  total  amount  of  savings  encourages  one  to  believe 
that  MOD  III  is  close  to  the  true  matrix. 

TABLE XXIII

MODEL I PREDICTIONS OF TOTAL NUMBER OF ACCOUNTS AND AMOUNT
OF SAVINGS ($M) USING CPM X, MOD I, MOD II AND MOD III

QUARTER                  II-72    III-72    IV-72     I-73    II-73

TOTAL        CPM X       17345     17447    17554    17664    17776
NUMBER       MOD I       17336     17428    17525    17625    17726
OF           MOD II      17335     17424    17516    17609    17702
ACCOUNTS     MOD III     17329     17405    17408    17552    17622
             ACTUAL      17354     17483    17485    17746    17820

TOTAL        CPM X       45.65     49.87    53.78    57.39    60.74
AMOUNT       MOD I       44.64     47.97    51.06    53.97    56.67
OF           MOD II      43.00     44.80    46.43    47.96    49.38
SAVINGS      MOD III     41.94     42.74    43.48    44.16    44.79
             ACTUAL      41.57     42.15    42.40    44.13    44.56
3.  Estimates of the Fundamental Matrix

The fundamental matrix (I - Q)^-1 was estimated by substituting
Q from CPM X into the expression. It is displayed in Table XXIV.

The ij-th element of this matrix is the expected number of time
periods that a new account beginning in Class i will spend in Class j
before closing. Thus a new account joining, say, Class IV will on the
average visit Class V for 2.4562 periods during its entire life in the system.

The expected total time a new account which joins Class i
spends in the system is the sum of the ith row of the fundamental matrix,
M.

The equilibrium distribution is obtained by multiplying the
distribution of arrivals by M. The results obtained are presented in Table
XXVI. Results obtained using MOD III are also presented.
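
The three quantities just described (the fundamental matrix, the expected time in the system and the equilibrium population) can be sketched for a small example. The two-class matrix Q and the arrival figures are hypothetical, the arrivals are expressed as expected numbers of new accounts per quarter in each class, and the inversion uses plain Gauss-Jordan elimination so that no library is needed.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(matrix)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def fundamental_matrix(q):
    """M = (I - Q)^-1 for the transient (open-account) classes."""
    n = len(q)
    i_minus_q = [[(1.0 if i == j else 0.0) - q[i][j] for j in range(n)]
                 for i in range(n)]
    return invert(i_minus_q)


# Hypothetical two-class Q; each row sums to less than one (closings).
q = [[0.80, 0.10],
     [0.05, 0.90]]
m = fundamental_matrix(q)

# Expected total quarters in the system for an account starting in class i.
expected_lifetimes = [sum(row) for row in m]

# Equilibrium population: new accounts per quarter in each class times M.
arrivals = [80.0, 20.0]
equilibrium = [sum(arrivals[i] * m[i][j] for i in range(2)) for j in range(2)]

print([[round(v, 4) for v in row] for row in m])
print([round(v, 2) for v in expected_lifetimes], [round(v, 1) for v in equilibrium])
```

In this example the equilibrium counts satisfy the balance condition: redistributing them by Q and adding the arrivals returns the same counts.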

The results are interesting in that they are predictions of the
final state of the population if current conditions were to prevail. This
state of equilibrium is reached when the number of new accounts opened
per quarter balances the number of accounts closed, and the number of
accounts moving out of each class is balanced by a corresponding number
of accounts moving in from other classes. The fundamental matrix obtained
with CPM X predicts that the population will grow from 17251, at Quarter
I-72, to a final value of 21734. The population of each class grows
larger except for Class II. However, as noted earlier, CPM X did not
predict the total amount of savings accurately; therefore, projection of
the equilibrium distribution using it has little value except to contrast
with the results obtained with MOD III.

[TABLE XXIV: fundamental matrix estimated with CPM X. The tabulated
entries are illegible in the source scan.]


[TABLE XXV: a second matrix table, apparently the fundamental matrix
estimated with MOD III. The tabulated entries are illegible in the
source scan.]


[TABLE XXVI: equilibrium distribution of accounts and savings computed
with CPM X and MOD III. The tabulated entries are illegible in the
source scan.]


The fundamental matrix obtained with MOD III produced more
believable predictions. It predicted that the total number of accounts
will grow to a maximum of 19363, with every class growing larger. The
equilibrium amount of savings in the population will be $53.74 million.
Thus, if current conditions prevail, the institution can expect the
passbook accounts to grow by another $10 million from the current level
of $44 million (as at 30 June 1973).

The population under consideration, however, did not include
accounts greater than $100,000. A separate study will therefore be required
to predict the equilibrium number of accounts in this group, which
numbered six at Quarter I-72.

The expected lengths of stay of accounts in the system are
presented in the following table:

TABLE XXVII

EXPECTED LENGTH OF STAY IN THE SYSTEM
COMPUTED WITH CPM X AND MOD III

             LENGTH OF STAY IN SYSTEM
                    (QUARTERS)
CLASS        CPM X        MOD III

II             26            23
III            29            27
IV             29            26
V              29            27
VI             29            27
VII            29            27
VIII           31            29
IX             31            29
X              32            30
XI             33            31

The expected length of stay in the system is almost constant
for all the classes except for Classes II and XI. The conclusion that can
be drawn from this observation is that the length of stay of a saver in
the system is relatively indifferent to the amount of savings he started
out with. The shorter life of accounts in Class II is a fact that has been
noticed previously. The longer life of accounts in Class XI is contrary to
expectation, as one would expect savers who do not have immediate need
for such large sums to transfer the passbook account into other types of
savings account which yield higher earnings. The observation may be
explained if these savers do not close their account when funds are
transferred to other types of accounts. The length of stay would then
reflect the length of time a saver wishes to remain a customer of the
savings institution. The fundamental matrix using CPM X predicts, on the
average, lengths of stay of 29.8 periods whereas the fundamental matrix
using MOD III predicts 27.6 periods. The smaller total number of accounts
predicted using MOD III can be explained by the fact that customers spend
less time in the system.
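
The two averages quoted above can be checked directly from the entries
of Table XXVII. The short sketch below takes the unweighted mean over
the ten classes; that the averages were meant as unweighted means is an
assumption of this sketch, although it reproduces the quoted figures.

```python
# Expected length of stay in quarters, from Table XXVII: (CPM X, MOD III).
LENGTH_OF_STAY = {
    "II": (26, 23), "III": (29, 27), "IV": (29, 26), "V": (29, 27),
    "VI": (29, 27), "VII": (29, 27), "VIII": (31, 29), "IX": (31, 29),
    "X": (32, 30), "XI": (33, 31),
}

# Unweighted mean over the ten classes (an assumption about how the
# averages in the text were formed).
cpm_x_mean = sum(cpm for cpm, _ in LENGTH_OF_STAY.values()) / len(LENGTH_OF_STAY)
mod_iii_mean = sum(mod for _, mod in LENGTH_OF_STAY.values()) / len(LENGTH_OF_STAY)
```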

Thus,  the  model  shows  that  efforts  to  keep  customers  in  the 
system  are  as  important  as  attracting  new  customers  into  the  system. 

B.         VALIDATION  OF  MODEL  II 

1. Prediction of Sample Population Behavior

The transition matrices used in predicting the behavior of the
sample were estimated by the method described in Chapter II, Section B.8.
The elements of the transition matrices that did not have predictors were
taken from CPM V, the estimate of the time stationary transition matrix
using data from the first five quarters. The predicted matrices are contained
in Appendix F. The predicted number of accounts in each class was
compared against the actual number observed. The chi square test was used
to determine the goodness of fit between the predicted and observed
distribution of accounts in the sample.

The results are presented in Appendix I. It was found that
the predictions matched the observations very closely for the first four
quarters. The chi square statistic for each of the first four quarters
did not exceed 6.70. However, the predictions for the fifth quarter were
extremely poor. The chi square statistic was 25.02. If the null hypothesis
that the predicted and observed distributions are the same were true, a
chi square statistic this large would be obtained only about 0.5 percent
of the time. The null hypothesis could thus be safely rejected at the 10%
level of significance.

An investigation of the causes of the failure of the model to
predict accurately for Quarter II-73 showed that the ten predictions of
transition probabilities for Quarter II-73 had altered the transition matrix
for Quarter II-73 substantially. Two exogenous variables, the prime
commercial paper rate (4-6 months) and the U.S. Government securities rate
(6 months), were considerably higher in Quarter II-73 than in the earlier
quarters. Thus the predictors were used beyond the range of the data from
which they were derived. This could lead to unexpected results.
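
A simple guard against this kind of extrapolation is to compare each new
value of an exogenous variable with the range observed while the
predictor was fitted. The sketch below is illustrative only: the
function name and the interest rates are hypothetical and are not taken
from the study's data.

```python
def outside_fitted_range(fitted_values, new_value):
    """Return True when new_value lies outside the range seen during fitting."""
    return new_value < min(fitted_values) or new_value > max(fitted_values)


# Hypothetical quarterly interest rates (percent) used to fit a predictor.
FITTED_RATES = [4.1, 4.6, 5.2, 4.9, 5.4, 5.0, 5.7, 6.2]

# A hypothetical later quarter with a rate well above anything in the fit.
needs_caution = outside_fitted_range(FITTED_RATES, 7.4)
```

A prediction flagged in this way is not necessarily wrong, but the
regression offers no evidence about its behavior in that region.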

To verify the hypothesis that Model II failed in Quarter II-73
because of the use of some predictors beyond the data base from which they
were derived, predictions were repeated using a matrix from which the
predictors that had X_o as an explanatory variable were removed. The chi
square statistic obtained with this modified matrix was 14.87, a
substantial improvement on that obtained without the modification. The
ninetieth percentile of the chi square distribution with ten degrees of
freedom is 15.99. Thus the null hypothesis could not be rejected at the
10% level of significance. It was therefore concluded that the hypothesis
on the failure of the model was correct.
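
The goodness of fit comparison used throughout this section can be
sketched as below. The critical value 15.99 and the statistics 25.02 and
14.87 are those quoted in the text; the observed and expected class
counts are hypothetical, and the function names are assumptions of this
sketch rather than the procedure actually coded for the study.

```python
def chi_square_statistic(observed, expected):
    """Pearson chi square statistic: sum of (O - E)^2 / E over all classes."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Ninetieth percentile of chi square with ten degrees of freedom (from the text).
CRITICAL_VALUE = 15.99


def reject_at_10_percent(statistic, critical_value=CRITICAL_VALUE):
    """Reject the null hypothesis when the statistic exceeds the critical value."""
    return statistic > critical_value


# Hypothetical counts in eleven classes (ten degrees of freedom).
observed = [55, 45, 52, 48, 50, 47, 53, 49, 51, 50, 50]
expected = [50.0] * 11
statistic = chi_square_statistic(observed, expected)

# Statistics reported in the text for Quarter II-73.
rejected_original = reject_at_10_percent(25.02)   # Model II
rejected_modified = reject_at_10_percent(14.87)   # modified matrix
```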

2. Prediction of Population Behavior

The complete Model II was used in the prediction of the behavior
of the population. The predicted number of new accounts opened in each
quarter was computed in Chapter III, Section B.9. The predicted number
of new accounts entering each class was presented in Chapter III, Section
B.10. The transition matrix used was the same as that used in the
prediction of sample population behavior in sub-section 1.

From the experience gained in earlier predictions with Model I, a
high predicted total amount of savings was expected. The modifications
applied to the transition matrix of Model I were therefore also applied
to Model II. The predicted total number of accounts and total amount of
savings are presented in Table XXVIII.

The total number of accounts predicted by Model II matched
the observed values closely for Quarters II-72, III-72 and IV-72, but
diverged quite widely by Quarter II-73. The predicted total amount of
savings was high throughout, and the divergence increased substantially
in Quarter II-73.

TABLE XXVIII

PREDICTED TOTAL NUMBER OF ACCOUNTS AND TOTAL AMOUNT OF SAVINGS
($M) BY MODEL II, USING CPM X, MOD I, MOD II AND MOD III

                                     QUARTER
                        II-72   III-72    IV-72     I-73    II-73

TOTAL       CPM X       17307    17380    17448    17985    18547
NUMBER OF   MOD I       17305    17374    17438    17973    18534
ACCOUNTS    MOD II      17304    17370    17430    17966    18531
            MOD III     17304    17364    17414    17953    18526
            ACTUAL      17354    17483    17485    17746    17820

TOTAL       CPM X       45.61    49.79    53.68    58.20    62.49
AMOUNT OF   MOD I       44.59    47.88    50.98    54.80    58.47
SAVINGS     MOD II      42.95    44.69    46.32    48.75    51.13
(MILLION    MOD III     41.90    42.64    43.31    44.80    46.19
DOLLARS)    ACTUAL      41.57    42.15    42.40    44.13    44.56

The hypothesis that the model failed to yield accurate
predictions because the predictors of transition probabilities were used
beyond the range of data used to obtain the predictors was put to another
test by predicting with a transition matrix from which the predictors
with X_o as an explanatory variable were removed. The predictions are
presented in Table XXIX.

It  can  be  seen  that  the  predicted  total  number  of  accounts  has 
improved  considerably  by  this  change  to  the  transition  matrices.     The 
improvement  to  predictions  of  total  amount  of  savings  is  not  so  pronounced. 

The validation sample of 4483 accounts taken from the Quarter
I-73 population was used to check whether Model II predicted the
population structure accurately. The predictions obtained with CPM X,
MOD I, MOD II and MOD III are presented in Table XXX. Predictions by
Model II' are presented in Table XXXI.

TABLE XXIX

PREDICTED TOTAL NUMBER OF ACCOUNTS AND TOTAL
AMOUNT OF SAVINGS BY MODEL II'

                                     QUARTER
                        II-72   III-72    IV-72     I-73    II-73

TOTAL       CPM X       17320    17373    17404    17738    18068
NUMBER OF   MOD I       17310    17354    17375    17697    18016
ACCOUNTS    MOD II      17310    17350    17365    17681    17992
            MOD III     17309    17343    17346    17645    17937
            ACTUAL      17354    17483    17485    17746    17820

TOTAL       CPM X       45.62    49.69    53.34    57.44    61.23
AMOUNT OF   MOD I       44.60    47.76    50.62    54.01    57.14
SAVINGS     MOD II      42.96    44.58    46.00    48.04    49.92
(MILLION    MOD III     41.90    42.55    43.06    44.28    45.37
DOLLARS)    ACTUAL      41.57    42.15    42.40    44.13    44.56

It can be seen that the predicted distribution improved with
each modification. The error in predicting the total amount of savings
can be attributed to the error in the prediction of number of accounts in
each class. As an example, the error in predicting the number of accounts
in Classes XI, VII and IV could account for $2.9 million in the prediction
of total amount of savings for Quarter I-73 using MOD II.

Though the predicted distribution using MOD III fitted the
observed distribution very closely, the error in predicting the number of
accounts in Class XI could account for $0.67 million of the error in
predicting the total amount of savings for the entire population. This
again demonstrates the importance of accurate predictions of number of
accounts in classes with large average amount of savings.

C. COMPARISON OF MODEL I AND MODEL II

1. Sample Population Behavior

The  chi  square  statistics  obtained  in  the  test  of  goodness  of 
fit  between  the  predicted  distributions  and  the  observed  distribution  were 
used  as  a  measure  of  the  predictive  power  of  the  two  models. 

Model II' denotes Model II modified by the deletion of five
predictors of transition probabilities which had X_o as an explanatory
variable. The chi square statistics obtained with Model I, Model II and
Model II' are presented in Table XXXII.

TABLE XXXII

COMPARISON OF CHI SQUARE STATISTICS
OBTAINED WITH MODELS I, II AND II'

                                QUARTER
CPM    MODEL     II-72   III-72    IV-72     I-73    II-73

V      I          3.49     2.45    11.05    10.74    11.91
V      II         3.60     1.98     6.70     4.35    25.02
V      II'        3.47     2.19     8.69     8.84    14.87
II     I          7.59    13.84    35.67    51.11    65.97
II     II         6.76    11.46    24.65    33.01    76.64
II     II'        6.99    12.83    30.73    43.05    68.39
X      I          3.26     1.12     5.93     5.03     3.57
X      II         2.97     0.94     3.82     1.26    15.92
X      II'        3.17     1.05     4.71     3.99     7.01

Except for Quarter II-73, Model II was generally superior to
Model I. Model II' improved the predictions for Quarter II-73 but did
not perform as well as Model II for the other quarters. The results were
expected, as Model II, having greater flexibility, should perform better
under normal situations. Model II', with only five predicted elements in
its transition matrix, would be expected to be less responsive to changes
in external conditions and thus would not perform as well as Model II.
Model I, being completely indifferent to external conditions, should be
expected to be the poorest performer among the three models.

The total amounts of savings predicted by Models I, II and II'
are presented in Table XXXIII.

TABLE XXXIII

COMPARISON OF PREDICTED TOTAL AMOUNT OF
SAVINGS ($M) BY MODELS I, II AND II'

CPM    MODEL     II-72   III-72    IV-72     I-73    II-73

V      I         3.672    3.509    3.351    3.201    3.057
V      II        3.674    3.507    3.346    3.201    3.043
V      II'       3.669    3.497    3.329    3.180    3.024
II     I         3.553    3.293    3.060    2.851    2.664
II     II        3.574    3.329    3.114    2.937    2.764
II     II'       3.567    3.315    3.092    2.902    2.713
X      I         3.727    3.613    3.500    3.389    3.280
X      II        3.729    3.609    3.488    3.373    3.234
X      II'       3.726    3.603    3.479    3.367    3.241

ACTUAL           3.627    3.535    3.509    3.404    3.418

The predictions of the three models were very close to one another.
In view of the variability of the predictions of total amount of savings it
was not possible to state which of the three models performed better.

2. Behavior of Entire Population

Both models predicted total number of accounts very closely
for the first three quarters. The performance of Model II deteriorated
badly in the fifth quarter, Quarter II-73. The failure of Model II in
Quarter II-73 was attributed to the failure of the predictors of transition
probabilities to predict beyond the data base from which they were derived.
Predictions made with a matrix modified by the removal of predictors which
had X_o as an explanatory variable were closer to the actual value for
Quarters I-73 and II-73 than predictions by Model II. Table XXXIV compares
the total number of accounts predicted by Model I, Model II and Model II'
(Model II modified as described above).

TABLE XXXIV

TOTAL NUMBER OF ACCOUNTS PREDICTED BY
MODEL I, MODEL II AND MODEL II' USING MOD III

               MODEL     II-72   III-72    IV-72     I-73    II-73

TOTAL          I         17329    17405    17480    17552    17622
NUMBER OF      II        17304    17364    17414    17953    18526
ACCOUNTS       II'       17309    17343    17346    17645    17937
               ACTUAL    17354    17483    17485    17746    17820

It can be seen that Model I predictions are closer to the
observed values for the first three quarters. However, unlike Models
II and II', Model I could not predict the sudden increase in the number
of accounts in Quarter I-73. This, again, shows that Model I is
applicable only when external conditions remain constant.

Both models were equally poor in predicting the total amount
of savings. The cause of the failure was attributed to sampling errors.
Similar modifications were made to the transition matrix of both models.
The improvement finally achieved was substantial, as can be seen in the
following table:

TABLE XXXV

COMPARISON OF TOTAL AMOUNT OF SAVINGS PREDICTED
BY MODELS I, II AND II' FOR QUARTER II-73

               MODEL     CPM X    MOD I   MOD II  MOD III   ACTUAL

TOTAL          I         60.74    56.67    49.38    44.79    44.56
AMOUNT OF      II        62.49    58.47    51.13    46.19    44.56
SAVINGS        II'       61.23    57.14    49.92    45.37    44.56

Predictions using CPM X, MOD I and MOD II are so different
from the observations that the difference between the Model I and
Model II' predictions is considered insignificant. In the case of
predictions made using MOD III, the errors between prediction and
observation are too small to discriminate between Model I and Model II'
using just one point. Thus, Table XXXVI, comparing the predictions of the
three models using MOD III over the entire period of prediction, is
presented below.

TABLE XXXVI

COMPARISON OF TOTAL AMOUNT OF SAVINGS PREDICTED
BY MODELS I, II AND II' USING MOD III

               MODEL     II-72   III-72    IV-72     I-73    II-73

TOTAL          I         41.94    42.74    43.48    44.16    44.79
AMOUNT OF      II        41.90    42.64    43.31    44.80    46.19
SAVINGS        II'       41.90    42.55    43.06    44.28    45.37
               ACTUAL    41.57    42.15    42.40    44.13    44.56
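
One way to summarize Table XXXVI over the whole prediction period is the
mean absolute error of each model against the observed totals. The
sketch below uses the figures from the table; the choice of mean
absolute error as the summary measure is an assumption of this sketch
and is not a measure used in the study itself.

```python
# Total amount of savings ($M) for Quarters II-72 to II-73, from Table XXXVI.
ACTUAL = [41.57, 42.15, 42.40, 44.13, 44.56]
PREDICTED = {
    "I":   [41.94, 42.74, 43.48, 44.16, 44.79],
    "II":  [41.90, 42.64, 43.31, 44.80, 46.19],
    "II'": [41.90, 42.55, 43.06, 44.28, 45.37],
}


def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and observed values."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


mae = {model: mean_absolute_error(values, ACTUAL)
       for model, values in PREDICTED.items()}
```

On this measure Model I and Model II' come out close together, at
roughly $0.46 million and $0.47 million respectively, while Model II is
at about $0.81 million, mostly because of Quarter II-73.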

The predictive power of each model in predicting the size
distribution of the population could not be compared, as the validation
sample was also used in estimating the parameters of MOD III. Thus, another
sample would have to be taken to validate this capability of the two
models. It is regrettable that this step could not be carried out at the
time of writing because of lack of time. It is therefore proposed that
the models be validated again at a later date.

V. SUMMARY AND CONCLUSIONS

A.         SUMMARY 

The purpose of this research has been to develop a model that can
be used to study the structure of a population of savings accounts in a
savings institution and to predict future levels of savings in the
institution.

Two stochastic models were developed and evaluated in this study.
The first model was based on the time stationary Markov chain model
extended to cover the phenomena of opening and closing of accounts.
The continuous distribution of the amount of savings in each account was
idealized by a discrete distribution with ten classes, numbered from two
to eleven. The class intervals of Classes II to IX were $2,000. Class X
contained all accounts with balances between $16,000 and $19,999 and
Class XI contained all accounts with balances between $20,000 and
$100,000. Class I was used as a reservoir for all the accounts that had
closed. The parameters of Model I were assumed to be constant over the
period of observation and prediction.

The second model was based on the nonstationary Markov chain
model. The parameters were not assumed to be constant. An econometric
model was used to relate the estimates of the parameters to a set of
exogenous variables. Predictors of the parameters, if found to be
significant, were used to predict future values of the parameters.

By assuming that the mean amount of savings of accounts in each
class remains constant with time, the total amount of savings in each
class could be computed by multiplying the number of accounts in each
class by the mean.
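
This computation can be sketched as follows. The class means and account
counts below are hypothetical round numbers chosen for illustration;
they are not the means or the distribution estimated in the study.

```python
CLASSES = ["II", "III", "IV", "V", "VI", "VII", "VIII", "IX", "X", "XI"]

# Hypothetical mean balance (dollars) of an account in each class,
# assumed constant over time.
CLASS_MEANS = [1000, 3000, 5000, 7000, 9000, 11000, 13000, 15000, 18000, 30000]

# Hypothetical predicted number of accounts in each class.
ACCOUNTS = [9000, 3000, 1500, 1000, 700, 500, 400, 300, 500, 350]

# Savings held in each class: number of accounts times the class mean.
savings_by_class = {name: count * mean
                    for name, count, mean in zip(CLASSES, ACCOUNTS, CLASS_MEANS)}

# Total amount of savings in the population.
total_savings = sum(savings_by_class.values())
```

In this illustration Class XI holds only 350 of the 17250 accounts but
contributes the largest share of the total, which is why errors in the
upper classes weigh so heavily on the predicted amount of savings.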

The parameters of the two models were estimated with data obtained
from the local branch of a savings institution. The level of savings of a
stratified sample of 622 accounts was observed over a period of ten
quarters, Quarter I-71 to Quarter II-73. Movements of accounts between
classes were recorded as transitions between the respective classes. The
transition probability matrix was estimated by dividing the number of
transitions from each class to each other class by the total number of
accounts in the originating class at the beginning of the quarter.
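
The estimation step amounts to normalizing each row of a table of
transition counts. A minimal sketch follows, with hypothetical counts
for two open classes and a closed state; the function name and the
numbers are assumptions made for illustration.

```python
def estimate_transition_matrix(counts):
    """Divide each row of transition counts by the row total.

    counts[i][j] is the number of accounts that started the quarter in
    origin class i and ended it in destination state j.
    """
    matrix = []
    for row in counts:
        total = sum(row)
        if total == 0:
            raise ValueError("origin class has no accounts")
        matrix.append([count / total for count in row])
    return matrix


# Hypothetical counts. Columns: closed, Class A, Class B.
COUNTS = [
    [5, 80, 15],   # from Class A: 5 closed, 80 stayed, 15 moved to Class B
    [2, 8, 40],    # from Class B: 2 closed, 8 moved to Class A, 40 stayed
]

P = estimate_transition_matrix(COUNTS)
```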

The  total  number  of  new  accounts  opened  in  each  quarter  of  the 
period  of  observation  was  used  to  estimate  the  arrival  rate  or  expected 
number  of  new  accounts  per  quarter. 

Two hundred and fifty new accounts were randomly selected each
quarter. These were used to determine whether the size distribution of new
accounts had changed during the period of observation. These accounts
were classified into the ten classes described earlier and the probability
of a new account being in each class was estimated. These estimates were
transformed into logits and regressed against a set of exogenous variables.
The regressions that were considered significant were used as predictors
for future values of the probability of a new account entering a particular
class.
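
The logit transformation and its use as a predictor can be sketched as
below. The sketch assumes the natural-logarithm logit, ln(p / (1 - p)),
and a single exogenous variable fitted by ordinary least squares; the
rates and probabilities are hypothetical, and the study's own
regressions may have used several variables.

```python
import math


def logit(p):
    """Natural-log logit of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def inverse_logit(z):
    """Map a logit back to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical interest rates (percent) and estimated probabilities that a
# new account enters a given class in each quarter.
RATES = [4.0, 4.5, 5.0, 5.5, 6.0]
SHARES = [0.30, 0.28, 0.27, 0.25, 0.22]

intercept, slope = fit_line(RATES, [logit(p) for p in SHARES])

# Predicted probability at a rate inside the fitted range.
predicted_share = inverse_logit(intercept + slope * 5.8)
```

Regressing the logit rather than the probability itself keeps every
predicted value between zero and one.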

The structure of the population of savings accounts for Quarter I-72
was determined and used as the initial distribution in predictions of the
behavior of the population.

The  chi  square  test  was  used  to  determine  if  the  transition  matrix 
had  changed  during  the  period  of  observation  and  if  the  predicted  size 
distributions  matched  the  observed  distributions. 

The parameters of Model I were estimated using data from the first
five quarters. The model was then used to predict the size distribution
of accounts of the sample and the amount of savings in the sample
population.

The size distribution of the population of savings accounts was
predicted using the distribution of the population at Quarter I-72 as the
initial distribution. Total number of accounts and total amount of savings
were also predicted.

Most  of  the  parameters  of  Model  II  were  estimated  using  data  from 
the  first  five  quarters.    Of  110  transition  probabilities  10  were  found  to 
vary  significantly  with  the  set  of  exogenous  variables.     Thus  the  transition 
matrix  of  Model  II  contained  only  ten  predicted  elements.     The  predictors 
were  determined  using  data  from  the  first  eight  quarters. 

Model  II  was  used  to  predict  the  size  distribution  of  accounts  in 
the  sample  and  the  amount  of  savings  in  the  sample.  It  was  then  used 
to  predict  the  behavior  of  the  population. 

A sample comprising one fourth of the population of Quarter I-73
was used to test whether the size distributions predicted by both models
were accurate. Predicted total number of accounts and total amount of
savings were also tested by comparison with actual values observed over
the prediction horizon.

B.         CONCLUSIONS 
1.         Model  I 

The hypothesis that the stochastic processes were stationary
during the period of observation could not be rejected at the ten percent
level of significance. Thus the assumption of stationarity could be
considered to hold.

The  predicted  size  distribution  of  the  sample  matched  the 
observed  distribution  closely.     The  largest  chi  square  statistic  obtained 
was  11.91.     This  corresponded  to  the  seventieth  percentile  of  the  chi 
square  distribution  with  ten  degrees  of  freedom.     It  was  concluded  that 
the  sample  of  622  accounts  behaved  as  described  by  the  Markov  chain 
model. 

The  predicted  total  amount  of  savings  differed  from  the  actual 
amount  by  a  maximum  of  ten  percent.    It  was  concluded  that  Model  I  could 
predict  total  amount  of  savings  but  the  variability  in  the  prediction  could 
be  rather  large  as  a  small  number  of  savers  with  large  accounts  could 
cause large fluctuations in the total amount of savings.

Model I failed to predict the behavior of the population. The
failure was attributed to errors in the estimation of the parameters of
the transition matrix. This explanation was supported by the fact that
predictions were substantially improved by changing the values of some
transition probabilities. The additional data in the validation sample
were used to adjust the estimates of a few transition probabilities.
Predictions of total amount of savings made with this modified matrix
were greatly improved; the maximum error was found to be half a percent.
A good fit between predicted and observed total amount of savings is not
by itself sufficient to indicate that the model has predicted the size
distribution of the population correctly. However, as the predicted size
distribution of the population of Quarter I-73 has been made to fit the
observed distribution, if the structure of the population did not change
drastically over the period of observation, then it is plausible that
the true transition matrix is not very different from the modified
matrix. It is regrettable that time did not permit the drawing of
further samples to validate the model so that a firmer conclusion could
be reached.

The  fundamental  matrix,  obtained  from  the  'best'  estimate  of 
the  transition  matrix,  predicted  that  the  maximum  total  number  of  accounts 
in  the  institution  will  be  19363,  and  the  maximum  total  amount  of  savings 
contributed  by  accounts  below  $100,000  will  be  $53.74  million,  if  the 
conditions  existing  during  the  period  of  the  data  were  to  persist. 

The average time an account remains open was predicted
to be 27.6 quarters (6.9 years). The expected length of stay of an
account in the system appeared to be independent of the amount of savings
in the account when it first joined the system, except when the amount
was less than $2,000 or more than $20,000. It was concluded that a
saver's desire to remain a customer of the institution did not depend on
his initial deposit.

A small increase in the expected length of stay of an account
in the system could have a large effect on the total amount of savings.
Thus efforts to keep customers contented, so that they remain longer in
the system, are important.
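
The role of the fundamental matrix in these conclusions can be illustrated with a small sketch. The three-class matrix Q, the arrival vector and the helper names below are hypothetical and are not the estimates obtained in this study. The sketch assumes that Q holds the transition probabilities among open-account classes only, so that each row falls short of one by the probability of closing, and that new accounts are added after each quarter's transitions (n_next = n Q + arrivals).

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def fundamental_matrix(q):
    """N = (I - Q)^-1 for the block Q of transitions among open-account classes."""
    n = len(q)
    i_minus_q = [[(1.0 if i == j else 0.0) - q[i][j] for j in range(n)]
                 for i in range(n)]
    return invert(i_minus_q)


# Hypothetical 3-class example. Each row sums to 0.95, so an account in any
# class closes with probability 0.05 per quarter.
Q = [
    [0.80, 0.15, 0.00],
    [0.10, 0.75, 0.10],
    [0.00, 0.10, 0.85],
]
N = fundamental_matrix(Q)

# Row sums of N: expected number of quarters an account stays open,
# indexed by the class in which it starts.
expected_stay = [sum(row) for row in N]

# Hypothetical arrivals per quarter by class. Under n_next = n Q + arrivals
# the equilibrium class counts are arrivals times N.
arrivals = [30.0, 15.0, 5.0]
equilibrium = [sum(arrivals[i] * N[i][j] for i in range(3)) for j in range(3)]
```

In this contrived example every class has the same closing probability, so the expected stay is 20 quarters whatever the starting class, and the equilibrium number of accounts is 50 arrivals per quarter times 20 quarters. The equilibrium total is proportional to the expected stay, which is why a small increase in length of stay has a large effect on total savings.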

2.         Model  II 

The predicted size distributions of the sample were very close
to the observed distributions for the first four periods. The maximum chi
square statistic was 6.7, which is less than the thirtieth percentile of
the chi square distribution with ten degrees of freedom. The chi square
statistic for the fifth quarter, Quarter II-73, rose sharply to 25.02. An
investigation showed that the model failed because five of the predictors
of transition probability were used beyond the range of the data base from
which they were derived, thus giving erroneous predictions for Quarter
II-73. It was therefore concluded that Model II could predict accurately
provided the predictors are not required to predict beyond the data base
on which they were derived.

The maximum error in predicting the total amount of savings
was about ten percent. The predictions were very close to those made by
Model I.

Model II fared no better than Model I in the prediction of
population behavior, for the same reasons as stated earlier.

3.         Discussion

Both models performed creditably in predicting the behavior of
the sample of 622 accounts. This is encouraging, as it leads one to
conclude that a population of savers does possess the Markovian property.

Failure of the models to predict the behavior of the entire
population correctly was attributed to errors in the estimation of
parameters. This explanation is plausible, as modifications to the
transition matrix, using additional data from the validation sample,
yielded predictions of total amount of savings that were accurate to half
a percent. It is also difficult to conceive how a random sample could
exhibit Markovian behavior without the population possessing that
characteristic, which lends further support to this explanation.

If external conditions do not have much influence on the behavior
of the population of savers then Model I, because of its simplicity,
is  the  ideal  model  to  use.     Model  I  could  still  be  used  if  the  rate  of 
change  of  the  population  behavior  is  slow.     Transition  probabilities 
could  be  estimated  each  quarter  and  exponential  smoothing  used  to  adjust 
the  past  estimates  with  this  additional  information.     However,  this  model 
does  not  allow  the  use  of  additional  information  regarding  the  operating 
environment  to  improve  the  predictions. 
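
The quarterly updating scheme suggested above can be sketched as follows. The smoothing constant alpha, the function name and the two-class matrices are assumptions made for illustration; the study does not specify them.

```python
def smooth_transition_matrix(previous, latest, alpha=0.3):
    """Blend the latest quarterly estimate into the running estimate.

    Each smoothed element is alpha * latest + (1 - alpha) * previous.
    Because this is a weighted average of two rows that each sum to one,
    the smoothed rows also sum to one.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [
        [alpha * new + (1.0 - alpha) * old for old, new in zip(old_row, new_row)]
        for old_row, new_row in zip(previous, latest)
    ]


# Hypothetical running estimate and newly observed quarterly estimate.
previous_estimate = [[0.90, 0.10], [0.20, 0.80]]
latest_estimate = [[0.80, 0.20], [0.30, 0.70]]

smoothed = smooth_transition_matrix(previous_estimate, latest_estimate, alpha=0.3)
```

A small alpha gives heavy weight to past estimates and suits a slowly changing population; a larger alpha tracks recent quarters more closely at the cost of noisier estimates.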

Model II has not been given an opportunity to demonstrate
its capability because of the limited data base. It has the advantage
that it improves with additional knowledge of the operating environment.
However, its main limitation is that it requires predictions of the
values of exogenous variables in order to predict future values of the
parameters of the model. Thus, predictions of Model II are only as good
as the predictions of the exogenous variables. The success of the model,
therefore, depends to a great extent on the judgement of the forecaster.

4.         Areas for Further Research

The Markovian property of a population is an important population
characteristic. The results observed in the application of the models
to the sample should be verified using a larger number of accounts,
preferably the entire population. A computerized bookkeeping system should
be able to take on the additional task of counting the number of transitions
between classes without much additional effort.

The  variability  of  predictions  in  total  amount  of  savings 
could  be  reduced  if  the  movement  of  large  accounts  could  be  predicted. 
Accounts  with  a  balance  exceeding  $100,000  could  be  the  subject  of 
another  study. 

The  present  study  did  not  deal  with  the  interaction  between 
various  types  of  accounts  in  a  savings  association.     Movement  of  accounts 
between  different  types  of  accounts  has  an  impact  on  the  total  amount  of 
savings  in  the  institution.    This  area  merits  further  research  especially 
if  management  desires  to  know  the  future  level  of  savings  of  the  whole 
institution. 

The variance of the predictions for more than one period is
difficult to derive, as the elements of the multi-step transition matrix
are sums of products of random variables that are normal when the sample
size is large. An alternative approach would be to use the Monte Carlo
method to obtain an estimate of the variance.
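
A minimal sketch of the Monte Carlo approach is given below. It is an illustration under stated assumptions rather than a procedure used in this study: the transition matrix and starting counts are hypothetical, index 0 is taken to be the closed-account state, arrivals are ignored, and the matrix is held fixed, so only the randomness of the transitions is simulated. Resampling the estimated probabilities themselves would be a further step.

```python
import random
from statistics import mean, variance


def step(counts, transition, rng):
    """Move every open account one quarter ahead; index 0 is the closed state."""
    new_counts = [0] * len(counts)
    new_counts[0] = counts[0]  # closed accounts stay closed
    for i in range(1, len(counts)):
        # One multinomial draw for the accounts currently in class i.
        destinations = rng.choices(range(len(counts)),
                                   weights=transition[i], k=counts[i])
        for j in destinations:
            new_counts[j] += 1
    return new_counts


def simulate_open_accounts(counts, transition, steps, runs, seed=0):
    """Return the sample mean and variance of open accounts after `steps` quarters."""
    rng = random.Random(seed)
    totals = []
    for _ in range(runs):
        state = list(counts)
        for _ in range(steps):
            state = step(state, transition, rng)
        totals.append(sum(state[1:]))
    return mean(totals), variance(totals)


# Hypothetical example: state 0 is closed, states 1 and 2 are open classes.
P = [
    [1.00, 0.00, 0.00],
    [0.10, 0.80, 0.10],
    [0.05, 0.15, 0.80],
]
start = [0, 200, 100]

# Mean and variance of the number of open accounts four quarters ahead.
mean_4, var_4 = simulate_open_accounts(start, P, steps=4, runs=500, seed=0)
```

For a single step the simulated variance can be checked against the analytical result of Appendix A, which gives some assurance before the method is applied to longer horizons.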

The  specification  of  the  econometric  models  used  in  predicting 
the  transition  probabilities,  arrival  rate  and  distribution  of  new  accounts 
does  not  imply  that  the  true  relationships  between  parameters  of  the  model 
and exogenous variables are as specified. This study has merely scratched
the surface of the problem of identifying causal relationships between the
parameters of the model and external factors. Further research in this
area is necessary before reliable predictors can be developed for the
parameters.

C.         RECOMMENDATIONS 

Model  I  can  be  turned  into  an  operational  tool  with  little  effort.    It 
is  recommended  that  the  parameters  of  the  model  be  updated  each  quarter 
to reflect slight changes that may have taken place. If possible, the
entire population should be used to estimate the parameters.

Model  II  can  be  made  operational  only  after  further  research  has 
been conducted to determine the predictors of the parameters of the model.

APPENDIX A

DERIVATION OF THE VARIANCES OF NUMBER OF ACCOUNTS
AND AMOUNT OF SAVINGS IN THE POPULATION FOR SINGLE
STEP TRANSITION

(1)        EXPECTATION,  VARIANCE  AND  COVARIANCE  OF  RANDOM  SUMS 

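Let $N$ and $M$ be integer random variables, let the $X_i$ be i.i.d.
and let the $Y_j$ be i.i.d. Define

$$X = \sum_{i=1}^{N} X_i, \qquad Y = \sum_{j=1}^{M} Y_j$$

Then

$$\begin{aligned}
E(XY) &= E\Big(\sum_{i=1}^{N} X_i \sum_{j=1}^{M} Y_j\Big) \\
      &= E\Big(\sum_{i=1}^{N} \sum_{j=1}^{M} X_i Y_j\Big) \\
      &= E(MN)\,E(X_i Y_j)
\end{aligned}$$

$$\begin{aligned}
\mathrm{Cov}(X,Y) &= E(XY) - E(X)E(Y) \\
                  &= E(MN)\,E(X_i Y_j) - E(N)E(X_i)\,E(M)E(Y_j)
\end{aligned}$$

If $X_i$ and $Y_j$ are uncorrelated then

$$\begin{aligned}
\mathrm{Cov}(X,Y) &= E(MN)\,E(X_i)E(Y_j) - E(N)E(M)\,E(X_i)E(Y_j) \\
                  &= E(X_i)E(Y_j)\big(E(MN) - E(M)E(N)\big) \\
                  &= E(X_i)E(Y_j)\,\mathrm{Cov}(M,N)
\end{aligned}$$

$$\mathrm{Var}(X) = E^2(X_i)\,\mathrm{Var}(N) + E(N)\,\mathrm{Var}(X_i)$$

Note: $E(X) = E(N)E(X_i)$ can be derived as follows:

$$\begin{aligned}
E(X) &= \sum_{n=0}^{\infty} E(X \mid N=n)\,P(N=n) \\
     &= \sum_{n=0}^{\infty} n\,E(X_i)\,P(N=n) \\
     &= E(N)\,E(X_i)
\end{aligned}$$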
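
The random sum results of part (1) can be checked numerically on a small example. The sketch below is illustrative only: the two distributions are hypothetical, and it assumes, as the variance formula requires, that the number of terms N is independent of the summands. The exact mean and variance obtained by listing every outcome are compared with the closed-form expressions.

```python
from itertools import product


def random_sum_moments(mean_n, var_n, mean_x, var_x):
    """Mean and variance of X = X_1 + ... + X_N using the formulas of part (1)."""
    mean_total = mean_n * mean_x
    var_total = mean_x ** 2 * var_n + mean_n * var_x
    return mean_total, var_total


def enumerate_random_sum(n_dist, x_dist):
    """Exact mean and variance of the random sum by listing every outcome."""
    first = second = 0.0
    for n, p_n in n_dist.items():
        # Every combination of n summand values; for n = 0 there is one empty combination.
        for combo in product(x_dist.items(), repeat=n):
            prob = p_n
            total = 0.0
            for value, p_x in combo:
                prob *= p_x
                total += value
            first += prob * total
            second += prob * total ** 2
    return first, second - first ** 2


# Hypothetical distributions: N takes values 0, 1, 2; each X_i is 1 or 3.
n_dist = {0: 0.2, 1: 0.5, 2: 0.3}
x_dist = {1.0: 0.6, 3.0: 0.4}

mean_n = sum(n * p for n, p in n_dist.items())
var_n = sum(n * n * p for n, p in n_dist.items()) - mean_n ** 2
mean_x = sum(x * p for x, p in x_dist.items())
var_x = sum(x * x * p for x, p in x_dist.items()) - mean_x ** 2

formula_mean, formula_var = random_sum_moments(mean_n, var_n, mean_x, var_x)
exact_mean, exact_var = enumerate_random_sum(n_dist, x_dist)
```

Agreement between the two pairs of values supports the expressions for E(X) and Var(X) on which parts (2) and (3) rely.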
(2)        EXPECTATION  AND  VARIANCE  OF  NUMBER  OF  ACCOUNTS 
Let $n_i$ = number of accounts in the ith class at beginning
of time period a.

$p_{ij}$ = transition probability between classes i and j,
$i = 2, 3, \ldots, m$; $j = 1, 2, \ldots, m$.

$x_{ij}$ = number of transitions between classes i and j
during period a.

$n_j^{a+1}$ = number of accounts in the jth class at beginning
of time period a+1.

$N^{a+1}$ = total number of accounts in the system at
beginning of time period a+1.

The assumption that accounts moving out of a class are distributed in
accordance with a multinomial distribution with parameters
$(p_{i1}, p_{i2}, \ldots, p_{im})$ is implicit in the Markov chain model.
If it can be further assumed that accounts moving out of different
classes are independent then the following expressions could be obtained.
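$$n_j^{a+1} = \sum_{i=2}^{m} x_{ij}$$

$$E(n_j^{a+1}) = \sum_{i=2}^{m} E(x_{ij}) = \sum_{i=2}^{m} n_i p_{ij}$$

$$\mathrm{Var}(n_j^{a+1}) = \sum_{i=2}^{m} \mathrm{Var}(x_{ij}) = \sum_{i=2}^{m} n_i p_{ij}(1 - p_{ij})$$

since $\mathrm{Cov}(x_{ij}, x_{kj}) = 0$ for $i \neq k$ by assumption of
independence between accounts exiting from different classes.

$$N^{a+1} = \sum_{j=2}^{m} n_j^{a+1}$$

$$E(N^{a+1}) = \sum_{j=2}^{m} \sum_{i=2}^{m} n_i p_{ij}$$

$$\mathrm{Var}(N^{a+1}) = \sum_{j=2}^{m} \mathrm{Var}(n_j^{a+1}) + 2 \sum_{j=2}^{m-1} \sum_{k=j+1}^{m} \mathrm{Cov}(n_j^{a+1}, n_k^{a+1})$$

$$\begin{aligned}
\mathrm{Cov}(n_j^{a+1}, n_k^{a+1}) &= \mathrm{Cov}\Big(\sum_{i=2}^{m} x_{ij}, \sum_{l=2}^{m} x_{lk}\Big) \\
&= E\Big(\sum_{i=2}^{m} x_{ij} \sum_{l=2}^{m} x_{lk}\Big) - E\Big(\sum_{i=2}^{m} x_{ij}\Big) E\Big(\sum_{l=2}^{m} x_{lk}\Big) \\
&= \sum_{i=2}^{m} \sum_{l=2}^{m} E(x_{ij} x_{lk}) - \sum_{i=2}^{m} \sum_{l=2}^{m} E(x_{ij}) E(x_{lk}) \\
&= \sum_{i=2}^{m} \sum_{l=2}^{m} \big[E(x_{ij} x_{lk}) - E(x_{ij}) E(x_{lk})\big] \\
&= \sum_{i=2}^{m} \sum_{l=2}^{m} \mathrm{Cov}(x_{ij}, x_{lk})
\end{aligned}$$

By assumption $\mathrm{Cov}(x_{ij}, x_{lk}) = 0$ if $i \neq l$, so

$$\mathrm{Cov}(n_j^{a+1}, n_k^{a+1}) = \sum_{i=2}^{m} \mathrm{Cov}(x_{ij}, x_{ik})$$

As $x_{ij}$ and $x_{ik}$ are multinomial random variables from the same
distribution,

$$\mathrm{Cov}(x_{ij}, x_{ik}) = -n_i p_{ij} p_{ik}$$

$$\mathrm{Cov}(n_j^{a+1}, n_k^{a+1}) = -\sum_{i=2}^{m} n_i p_{ij} p_{ik}$$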
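
The expressions of part (2) translate directly into a short computation. The sketch below is illustrative: the counts and probabilities are hypothetical, the function names are not from the study, and destination column 0 stands for the first class of the text, which is assumed here to be the closed-account state. Rows of p correspond to the open origin classes only.

```python
def account_count_moments(n, p):
    """Mean vector and covariance matrix of next-period class counts.

    n[i] is the number of accounts in origin class i and p[i][j] is the
    transition probability from origin class i to destination class j.
    """
    origins = range(len(n))
    classes = range(len(p[0]))
    mean = [sum(n[i] * p[i][j] for i in origins) for j in classes]
    cov = [[0.0] * len(p[0]) for _ in classes]
    for j in classes:
        for k in classes:
            if j == k:
                # Var(n_j) = sum_i n_i p_ij (1 - p_ij)
                cov[j][k] = sum(n[i] * p[i][j] * (1.0 - p[i][j]) for i in origins)
            else:
                # Cov(n_j, n_k) = -sum_i n_i p_ij p_ik
                cov[j][k] = -sum(n[i] * p[i][j] * p[i][k] for i in origins)
    return mean, cov


def total_moments(mean, cov, classes):
    """Mean and variance of the sum of counts over the listed classes.

    Summing the full covariance block equals the sum of the variances
    plus twice the sum of the covariances over pairs j < k.
    """
    total_mean = sum(mean[j] for j in classes)
    total_var = sum(cov[j][k] for j in classes for k in classes)
    return total_mean, total_var


# Hypothetical example: two open origin classes; destinations are
# column 0 (closed), column 1 and column 2.
n = [200, 100]
p = [
    [0.10, 0.80, 0.10],
    [0.05, 0.15, 0.80],
]
mean, cov = account_count_moments(n, p)
open_mean, open_var = total_moments(mean, cov, classes=[1, 2])
```

In this example the variance of the total equals the variance of the number of closures, 200(0.10)(0.90) + 100(0.05)(0.95) = 22.75, which is a useful consistency check on the covariance terms.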


(3)        EXPECTATION  AND  VARIANCE  OF  AMOUNT  OF  SAVINGS 

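Let $z_{kj}$ = size of the kth account that has entered the jth class

$Z_j^{a+1}$ = amount of savings in class j at the beginning of
period a+1

$Z^{a+1}$ = total amount of savings in the system at the beginning
of period a+1

$$Z_j^{a+1} = \sum_{i=2}^{m} \sum_{k=1}^{x_{ij}} z_{kj}$$

$$E(Z_j^{a+1}) = \sum_{i=2}^{m} E\Big(\sum_{k=1}^{x_{ij}} z_{kj}\Big) = \sum_{i=2}^{m} E(x_{ij})\,E(z_{kj}) \quad \text{(using results from (1))}$$

$$\begin{aligned}
\mathrm{Var}(Z_j^{a+1}) &= \sum_{i=2}^{m} \mathrm{Var}\Big(\sum_{k=1}^{x_{ij}} z_{kj}\Big) + 2 \sum_{i=2}^{m-1} \sum_{l=i+1}^{m} \mathrm{Cov}\Big(\sum_{k=1}^{x_{ij}} z_{kj}, \sum_{k=1}^{x_{lj}} z_{kj}\Big) \\
&= \sum_{i=2}^{m} \big[E(x_{ij})\,\mathrm{Var}(z_{kj}) + E^2(z_{kj})\,\mathrm{Var}(x_{ij})\big] + 2 \sum_{i=2}^{m-1} \sum_{l=i+1}^{m} E^2(z_{kj})\,\mathrm{Cov}(x_{ij}, x_{lj}) \\
&= \sum_{i=2}^{m} \big[n_i p_{ij}\,\mathrm{Var}(z_{kj}) + E^2(z_{kj})\,n_i p_{ij}(1 - p_{ij})\big]
\end{aligned}$$

The covariance terms drop out as $\mathrm{Cov}(x_{ij}, x_{lj}) = 0$ if $i \neq l$.

$$\begin{aligned}
\mathrm{Cov}(Z_j^{a+1}, Z_l^{a+1}) &= \mathrm{Cov}\Big(\sum_{i=2}^{m} \sum_{k=1}^{x_{ij}} z_{kj}, \sum_{h=2}^{m} \sum_{n=1}^{x_{hl}} z_{nl}\Big) \\
&= \sum_{i=2}^{m} \sum_{h=2}^{m} \mathrm{Cov}\Big(\sum_{k=1}^{x_{ij}} z_{kj}, \sum_{n=1}^{x_{hl}} z_{nl}\Big) \\
&= \sum_{i=2}^{m} E(z_{kj})\,E(z_{nl})\,\mathrm{Cov}(x_{ij}, x_{il}) \\
&= -\sum_{i=2}^{m} E(z_{kj})\,E(z_{nl})\,n_i p_{ij} p_{il}
\end{aligned}$$

$$\mathrm{Var}(Z^{a+1}) = \sum_{j=2}^{m} \mathrm{Var}(Z_j^{a+1}) + 2 \sum_{j=2}^{m-1} \sum_{l=j+1}^{m} \mathrm{Cov}(Z_j^{a+1}, Z_l^{a+1})$$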